Re: Looking for Maintainers for Flink on YARN

2022-01-28 Thread
Hello, I would like to maintain the yarn component. In our company, we are using yarn to schedule flink mainly. The stability is importance to us. I will spend some time to check the issues. Konstantin Knauf 于2022年1月26日周三 17:17写道: > Hi everyone, > > We are seeing an increasing number of test

Re: [VOTE] FLIP-199: Change some default config values of blocking shuffle for better usability

2022-01-11 Thread
+1 for the proposal. In fact, we have used these params in our inner flink version for good performance. Yun Gao 于2022年1月12日周三 10:42写道: > +1 since it would highly improve the open-box experience for batch jobs. > > Thanks Yingjie for drafting the PR and initiating the discussion. > > Best, >

Re: [DISCUSS] Change some default config values of blocking shuffle

2022-01-04 Thread
> > >>>> >> > Hi Jiangang, >>>> >> > >>>> >> > Thanks for your suggestion. >>>> >> > >>>> >> > >>> The config can affect the memory usage. Will the related >>>> memor

Re: Re: [DISCUSS] Introduce Hash Lookup Join

2021-12-29 Thread
Thank you for the proposal, Jing. I like the idea to partition data by some key to improve the cache hit. I have some questions: 1. When it comes to hive, how do you load partial data instead of the whole data? Any change related with hive? 2. How to define the cache configuration? For

Re: [DISCUSS] FLIP-198: Working directory for Flink processes

2021-12-12 Thread
I like the idea. It can reuse the disk to do many things. Isn't it only for inner failover? If not, the cleaning may be a problem. Also, many resource components have their own disk schedule strategy. Chesnay Schepler 于2021年12月12日周日 19:59写道: > How do you intend to handle corrupted files, in

Re: [ANNOUNCE] New Apache Flink Committer - Ingo Bürk

2021-12-12 Thread
Congratulations! Best Liu Jiangang Nicholas Jiang 于2021年12月13日周一 11:28写道: > Congratulations, Ingo! > > Best, > Nicholas Jiang >

Re: [ANNOUNCE] New Apache Flink Committer - Matthias Pohl

2021-12-12 Thread
Congratulations! Best Liu Jiangang Nicholas Jiang 于2021年12月13日周一 11:23写道: > Congratulations, Matthias! > > Best, > Nicholas Jiang >

Re: [DISCUSS] FLIP-168: Speculative execution for Batch Job

2021-12-12 Thread
Any progress on the feature? We have the same requirement in our company. Since the soft and hard environment can be complex, it is normal to see a slow task which determines the execution time of the flink job. 于2021年6月20日周日 22:35写道: > Hi everyone, > > I would like to kick off a discussion on

Re: [DISCUSS] Change some default config values of blocking shuffle

2021-12-10 Thread
Glad to see the suggestion. In our test, we found that small jobs with the changing configs can not improve the performance much just as your test. I have some suggestions: - The config can affect the memory usage. Will the related memory configs be changed? - Can you share the tpcds

Re: [ANNOUNCE] New Apache Flink Committer - Ingo Bürk

2021-12-02 Thread
Congratulations! Best, Liu Jiangang Till Rohrmann 于2021年12月2日周四 下午11:24写道: > Hi everyone, > > On behalf of the PMC, I'm very happy to announce Ingo Bürk as a new Flink > committer. > > Ingo has started contributing to Flink since the beginning of this year. He > worked mostly on SQL

Re: [ANNOUNCE] New Apache Flink Committer - Matthias Pohl

2021-12-02 Thread
Congratulations! Best, Liu Jiangang Till Rohrmann 于2021年12月2日周四 下午11:28写道: > Hi everyone, > > On behalf of the PMC, I'm very happy to announce Matthias Pohl as a new > Flink committer. > > Matthias has worked on Flink since August last year. He helped review a ton > of PRs. He worked on a

Re: [ANNOUNCE] Open source of remote shuffle project for Flink batch data processing

2021-11-30 Thread
Good work for flink's batch processing! Remote shuffle service can resolve the container lost problem and reduce the running time for batch jobs once failover. We have investigated the component a lot and welcome Flink's native solution. We will try it and help improve it. Thanks, Liu Jiangang

Re: [ANNOUNCE] New Apache Flink Committer - Jing Zhang

2021-11-22 Thread
Congratulations! Matthias Pohl 于2021年11月22日周一 下午4:10写道: > Congratulations :-) > > On Thu, Nov 18, 2021 at 3:23 AM Jingsong Li > wrote: > > > Congratulations, Jing! Well deserved! > > > > On Wed, Nov 17, 2021 at 3:00 PM Lincoln Lee > > wrote: > > > > > > Congratulations, Jing! > > > > > >

Re: [DISCUSS] Improve the name and structure of job vertex and operator name for job

2021-11-20 Thread
+1 for the FLIP. We have met the problem that a long name stuck the metric collection for SQL jobs. wenlong.lwl 于2021年11月19日周五 下午10:29写道: > hi, yun, > Thanks for the suggestion, but I am not sure whether we need such a prefix > or not, because the log has included vertex id, when the name is

Re: [DISCUSS] FLIP-185: Shorter heartbeat timeout and interval default values

2021-07-22 Thread
Thanks, Till. There are many reasons to reduce the heartbeat interval and timeout. But I am not sure what values are suitable. In our cases, the GC time and big job can be related factors. Since most flink jobs are pipeline and a total failover can cost some time, we should tolerate some

Re: Job Recovery Time on TM Lost

2021-07-12 Thread
Yes, time is main when detecting the TM's liveness. The count method will check by certain intervals. Gen Luo 于2021年7月9日周五 上午10:37写道: > @刘建刚 > Welcome to join the discuss and thanks for sharing your experience. > > I have a minor question. In my experience, network failures

Re: [DISCUSS] FLIP-182: Watermark alignment

2021-07-12 Thread
+1 for the source watermark alignment. In the previous flink version, the source connectors are different in implementation and it is hard to make this feature. When the consumed data is not aligned or consuming history data, it is very easy to cause the unalignment. Source alignment can resolve

Re: Job Recovery Time on TM Lost

2021-07-07 Thread
It is really helpful to find the lost container quickly. In our inner flink version, we optimize it by task's report and jobmaster's probe. When a task fails because of the connection, it reports to the jobmaster. The jobmaster will try to confirm the liveness of the unconnected taskmanager for

Re: [ANNOUNCE] New Apache Flink Committer - Yang Wang

2021-07-06 Thread
Congratulations, Yang Wang. Best Jiangang Liu Leonard Xu 于2021年7月7日周三 上午10:27写道: > Congratulations! Yang Wang > > > Best, > Leonard > > 在 2021年7月7日,10:23,tison 写道: > > > > Congratulations and well deserved! > > > > It is my pressure to work with you excellent developer. > > > > Best, > >

Re: [ANNOUNCE] New PMC member: Guowei Ma

2021-07-06 Thread
Congratulations,Guowei Ma. Best Jiangang Liu tison 于2021年7月7日周三 上午10:24写道: > Congrats! NB. > > Best, > tison. > > > Jark Wu 于2021年7月7日周三 上午10:20写道: > > > Congratulations Guowei! > > > > Best, > > Jark > > > > On Wed, 7 Jul 2021 at 09:54, XING JIN wrote: > > > > > Congratulations, Guowei~ ! >

Re: [DISCUSS] Better user experience in the WindowAggregate upon Changelog (contains update message)

2021-07-01 Thread
Thanks for the discussion, JING ZHANG. I like the first proposal since it is simple and consistent with dataStream API. It is helpful to add more docs about the special late case in WindowAggregate. Also, I expect the more flexible emit strategies later. Jark Wu 于2021年7月2日周五 上午10:33写道: > Sorry,

Re: [VOTE] FLIP-147: Support Checkpoint After Tasks Finished

2021-06-28 Thread
+1 (binding) Best liujiangang Piotr Nowojski 于2021年6月29日周二 上午2:05写道: > +1 (binding) > > Piotrek > > pon., 28 cze 2021 o 12:48 Dawid Wysakowicz > napisał(a): > > > +1 (binding) > > > > Best, > > > > Dawid > > > > On 28/06/2021 10:45, Yun Gao wrote: > > > Hi all, > > > > > > For FLIP-147[1]

Re: [DISCUSS] Do not merge PRs with "unrelated" test failures.

2021-06-28 Thread
> > > > > > > should > > > > > > > > > > enable > > > > > > > > > > > > the > > > > > > > > > > > > > > case again no matter if it is fixed or not. >

Re: [VOTE] FLIP-171: Async Sink

2021-06-24 Thread
A common base for implementing sink is really helpful. +1 (binding) Thanks liujiangang Danny Cranmer 于2021年6月24日周四 下午7:10写道: > Looking forward to simplifying adding new destination sinks. > > +1 (binding). > > Thanks > > On Thu, 24 Jun 2021, 10:46 Arvid Heise, wrote: > > > Thanks for

Re: Re: [ANNOUNCE] New PMC member: Arvid Heise

2021-06-24 Thread
Congratulations Best liujiangang Matthias Pohl 于2021年6月23日周三 下午2:11写道: > Congratulations, Arvid! :-) > > On Thu, Jun 17, 2021 at 9:02 AM Arvid Heise wrote: > > > Thank you for your trust and support. > > > > Arvid > > > > On Thu, Jun 17, 2021 at 8:39 AM Roman Khachatryan > > wrote: > > > > >

Re: [VOTE] FLIP-169: DataStream API for Fine-Grained Resource Requirements

2021-06-23 Thread
+1 (binding) Thanks liujiangang Zhu Zhu 于2021年6月24日周四 上午11:38写道: > +1 (binding) > > Thanks, > Zhu > > Yangze Guo 于2021年6月21日周一 下午3:42写道: > > > According to the latest comment of Zhu Zhu[1], I append the potential > > resource deadlock in batch jobs as a known limitation to this FLIP. > >

Re: [DISCUSS] Do not merge PRs with "unrelated" test failures.

2021-06-22 Thread
It is a good principle to run all tests successfully with any change. This means a lot for project's stability and development. I am big +1 for this proposal. Best liujiangang Till Rohrmann 于2021年6月22日周二 下午6:36写道: > One way to address the problem of regularly failing tests that block >

Re: [ANNOUNCE] New PMC member: Xintong Song

2021-06-16 Thread
Congrats, Xintong! Best. Xintong Song 于2021年6月16日周三 下午5:51写道: > Thanks all for the support. > It's my honor to be part of the community and work with all of you great > people. > > Thank you~ > > Xintong Song > > > On Wed, Jun 16, 2021 at 5:31 PM Konstantin Knauf > wrote: > > >

Re: [ANNOUNCE] New PMC member: Arvid Heise

2021-06-16 Thread
Congratulations, Arvid! Best Xintong Song 于2021年6月16日周三 下午5:51写道: > Congratulations, Arvid~! > > Thank you~ > > Xintong Song > > > > On Wed, Jun 16, 2021 at 5:31 PM Konstantin Knauf > wrote: > > > Congratulations, Arvid! > > > > On Wed, Jun 16, 2021 at 11:30 AM Yangze Guo wrote: > > > > >

Re: Add control mode for flink

2021-06-11 Thread
is problem. > > Cheers, > Till > > On Fri, Jun 11, 2021 at 9:51 AM Jary Zhen <[hidden email] > <http:///user/SendEmail.jtp?type=node=44392=0>> wrote: > >> big +1 for this feature, >> >>1. Reset kafka offset in certain cases. >>2. Stop chec

Re: Add control mode for flink

2021-06-10 Thread
o agree with the summarization by Xintong and Jing that control > >>>> flow seems to be > >>>> a common buidling block for many functionalities and dynamic > >>>> configuration framework > >>>> is a representative application that frequently required by users. > >>>> Rega

Re: Re: Add control mode for flink

2021-06-08 Thread
o broadcast an event through the iteration body >>> to detect if there are still >>> records reside in the iteration body). And regarding whether to >>> implement the dynamic configuration >>> framework, I also agree with Xintong that the consistency guarantee >>> would be a point to consider, we >>> might co

Re: Add control mode for flink

2021-06-07 Thread
s is for sure one possible > approach. The reason we are in favor of introducing the control flow is > that: > - It benefits not only this specific dynamic controlling feature, but > potentially other future features as well. > - AFAICS, it's non-trivial to make a 3rd-party dynamic configur

Re: Add control mode for flink

2021-06-06 Thread
//medium.com/twodigits/dynamic-app-configuration-inject-configuration-at-run-time-using-spring-boot-and-docker-ffb42631852a > > On Fri, Jun 4, 2021 at 7:09 AM 刘建刚 wrote: > >> Hi everyone, >> >> Flink jobs are always long-running. When the job is running, users >

Add control mode for flink

2021-06-04 Thread
Hi everyone, Flink jobs are always long-running. When the job is running, users may want to control the job but not stop it. The control reasons can be different as following: 1. Change data processing’ logic, such as filter condition. 2. Send trigger events to make the

Re: [VOTE] FLIP-143: Unified Sink API

2020-09-29 Thread
+1 (binding) Best, Liu Jiangang Jingsong Li 于2020年9月29日周二 下午1:36写道: > +1 (binding) > > Best, > Jingsong > > On Mon, Sep 28, 2020 at 3:21 AM Kostas Kloudas wrote: > > > +1 (binding) > > > > @Steven Wu I think there will be opportunities to fine tune the API > > during the implementation. > > >

Re: [ANNOUNCE] New Apache Flink Committer - Arvid Heise

2020-09-15 Thread
Congratulations! Best Matthias Pohl 于2020年9月15日周二 下午6:07写道: > Congratulations! ;-) > > On Tue, Sep 15, 2020 at 11:47 AM Xingbo Huang wrote: > > > Congratulations! > > > > Best, > > Xingbo > > > > Igal Shilman 于2020年9月15日周二 下午5:44写道: > > > > > Congrats Arvid! > > > > > > On Tue, Sep 15, 2020

Re: [ANNOUNCE] New Apache Flink Committer - Niels Basjes

2020-09-14 Thread
Congratulations! Best, liujiangang Danny Chan 于2020年9月15日周二 上午9:44写道: > Congratulations!  > > Best, > Danny Chan > 在 2020年9月15日 +0800 AM9:31,dev@flink.apache.org,写道: > > > > Congratulations!  >

Re: [ANNOUNCE] Yu Li became a Flink committer

2020-01-24 Thread
Congratulations! > 2020年1月23日 下午4:59,Stephan Ewen 写道: > > Hi all! > > We are announcing that Yu Li has joined the rank of Flink committers. > > Yu joined already in late December, but the announcement got lost because > of the Christmas and New Years season, so here is a belated proper >

Re: java.lang.StackOverflowError

2020-01-21 Thread
I am using flink 1.6.2 on yarn. State backend is rocksdb. > 2020年1月22日 上午10:15,刘建刚 写道: > > I have a flink job which fails occasionally. I am eager to avoid this > problem. Can anyone help me? The error stacktrace is as following: > java.io.IOException: java.lang.Sta

java.lang.StackOverflowError

2020-01-21 Thread
I have a flink job which fails occasionally. I am eager to avoid this problem. Can anyone help me? The error stacktrace is as following: java.io.IOException: java.lang.StackOverflowError at

How to get kafka record's timestamp in job

2019-12-31 Thread
In kafka010, ConsumerRecord has a field named timestamp. It is encapsulated in Kafka010Fetcher. How can I get the timestamp when I write a flink job? Thank you very much.

Re: Error:java: 无效的标记: --add-exports=java.base/sun.net.util=ALL-UNNAMED

2019-11-21 Thread
mg)" >> >> Hope it works for you >> Thanks. >> >> [1]. https://www.jetbrains.com/idea/download/other.html >> >> >> On Mon, Nov 4, 2019 at 5:44 PM Till Rohrmann wrote: >> >>> Try to reimport that maven project. This should re

Re: How to estimate the memory size of flink state

2019-11-20 Thread
ail.163.com/dashi-web-extend/html/proSignature.html?ftlId=1=sysukelee=sysukelee%40gmail.com=https%3A%2F%2Fmail-online.nosdn.127.net%2Fqiyelogo%2FdefaultAvatar.png=%5B%22sysukelee%40gmail.com%22%5D> > On 11/20/2019 15:08,刘建刚 > <mailto:liujiangangp...@gmail.com> wrote: > We are using flin

How to estimate the memory size of flink state

2019-11-19 Thread
We are using flink 1.6.2. For filesystem backend, we want to monitor the state size in memory. Once the state size becomes bigger, we can get noticed and take measures such as rescaling the job, or the job may fail because of the memory. We have tried to get the memory usage for the

How to use two continuously window with EventTime in sql

2019-10-29 Thread
For one sql window, I can register table with event time and use time field in the tumble window. But if I want to use the result for the first window and use another window to process it, how can I do it? Thank you.

Uncertain result when using group by in stream sql

2019-09-13 Thread
I use flink stream sql to write a demo about "group by". The records are [(bj, 1), (bj, 3), (bj, 5)]. I group by the first element and sum the second element. Every time I run the program, the result is different. It seems that the records are out of order. Even sometimes record is

How to implement grouping set in stream

2019-09-10 Thread
I want to implement grouping set in stream. I am new to flink sql. I want to find a example to teach me how to self define rule and implement corresponding operator. Can anyone give me any suggestion?

How to calculate one day's uv every minute by SQL

2019-09-04 Thread
We want to calculate one day's uv and show the result every minute . We have implemented this by java code: dataStream.keyBy(dimension) .incrementWindow(Time.days(1), Time.minutes(1)) .uv(userId) The input data is big. So we use

How to load udf jars in flink program

2019-08-15 Thread
We are using per-job to load udf jar when start job. Our jar file is in another path but not flink's lib path. In the main function, we use classLoader to load the jar file by the jar path. But it reports the following error when job starts running. If the jar file is in lib,

How to convert protobuf to Row

2019-05-06 Thread
I read byte data from Kafka. I use a class ProtoSchema implemented DeserializationSchema to get the actual java class. My question is that how can I transfer the byte data to Row just by ProtoSchema? What if the data structure is nested? Thank you.

Containers are not released after job failed

2019-04-26 Thread
I run flink 1.6.2 on yarn. At some time, job is failed becuase of: org.apache.flink.util.FlinkException: The assigned slot container_e708_1555051789618_2644286_01_61_0 was removed Then the job restarts. After some time, the container container_e708_1555051789618_2644286_01_61

One source is much slower than the other side when join history data

2019-02-26 Thread
When consuming history data in join operator with eventTime, reading data from one source is much slower than the other. As a result, the join operator will cache much data from the faster source in order to wait the slower source. The question is that how can I make the difference of

request for access

2019-02-13 Thread
Hi Guys, I want to contribute to Apache Flink. Would you please give me the permission as a contributor? My JIRA Username is Jiangang. My JIRA full name is Liu.