Re: [DISCUSS] FLIP-27: Refactor Source Interface

2019-12-04 Thread bupt_ljy
Hi Becket,


Thanks for updating the progress!


I have a question about the #OperatorCoordinator. Will there be any 
communication between different #OperatorCoordinators (or is that part of the future plan)? 
If so, it might be able to cover some of the cases in FLIP-17[1], such as 
initializing static data before processing the main input. Of course this requires 
more thinking; I just want to share some ideas.


+1 to the FLIP and detailed design!




[1]. 
https://cwiki.apache.org/confluence/display/FLINK/FLIP-17+Side+Inputs+for+DataStream+API


Best,
Jiayi Liao


 Original Message 
Sender: Stephan Ewen
Recipient: dev
Date: Wednesday, Dec 4, 2019 18:25
Subject: Re: [DISCUSS] FLIP-27: Refactor Source Interface


Thanks, Becket, for updating this. I agree with moving the aspects you 
mentioned into separate FLIPs - this one was becoming unwieldy in size. +1 to 
the FLIP in its current state. It's a very detailed write-up, nicely done!

On Wed, Dec 4, 2019 at 7:38 AM Becket Qin wrote:

Hi all,
Sorry for the long belated update. I have updated the FLIP-27 wiki page with 
the latest proposals. Some noticeable changes include:
1. A new generic communication mechanism between SplitEnumerator and SourceReader.
2. Some detailed API method signature changes.
We left a few things out of this FLIP and will address them in separate FLIPs, including:
1. Per split event time.
2. Event time alignment.
3. Fine grained failover for SplitEnumerator failure.
Please let us know if you have any questions.
Thanks,
Jiangjie (Becket) Qin

On Sat, Nov 16, 2019 at 6:10 AM Stephan Ewen wrote:

Hi Łukasz!
Becket and me are working hard on figuring out the last details and 
implementing the first PoC. We would update the FLIP hopefully next week.
There is a fair chance that a first version of this will be in 1.10, but I 
think it will take another release to battle test it and migrate the connectors.
Best,
Stephan

On Fri, Nov 15, 2019 at 11:14 AM Łukasz Jędrzejewski wrote:

Hi,
This proposal looks very promising for us. Do you have any plans for which 
Flink release it is going to land in? We are thinking of using the DataSet 
API for our future use cases, but on the other hand the DataSet API is going 
to be deprecated, so using the proposed bounded data streams solution could be 
more viable in the long term.
Thanks,
Łukasz

On 2019/10/01 15:48:03, Thomas Weise wrote:

Thanks for putting together this proposal!
I see that the "Per Split Event Time" and "Event Time Alignment" sections 
are still TBD.
It would probably be good to flesh those out a bit before proceeding too far, 
as the event time alignment will probably influence the interaction with 
the split reader, specifically ReaderStatus emitNext(SourceOutput output).
We currently have only one implementation for event time alignment, in the 
Kinesis consumer. The synchronization in that case takes place as the last 
step before records are emitted downstream (RecordEmitter). With the 
currently proposed interfaces, the equivalent can be implemented in the 
reader loop, although note that in the Kinesis consumer the per-shard 
threads push records.
Synchronization has not been implemented for the Kafka consumer yet.
https://issues.apache.org/jira/browse/FLINK-12675
When I looked at it, I realized that the implementation will look quite 
different from Kinesis, because it needs to take place in the pull part, where 
records are taken from the Kafka client. Due to the multiplexing it cannot be 
done by blocking the split thread like it currently works for Kinesis. Reading 
from individual Kafka partitions needs to be controlled via pause/resume 
on the Kafka client.
To take on that responsibility, the split thread would need to be aware of 
the watermarks, or at least whether it should or should not continue to 
consume a given split, and this may require a different SourceReader or SourceOutput 
interface.
Thanks,
Thomas

On Fri, Jul 26, 2019 at 1:39 AM Biao Liu wrote:

Hi Stephan,
Thank you for the feedback! Will take a look at your branch before the public discussion.

On Fri, Jul 26, 2019 at 12:01 AM Stephan Ewen wrote:

Hi Biao!
Thanks for reviving this. I would like to join this discussion, but am 
quite occupied with the 1.9 release, so can we maybe pause this

Re: [ANNOUNCE] Jark Wu is now part of the Flink PMC

2019-11-08 Thread bupt_ljy
Congratulations Jark!


Best,
Jiayi Liao


 Original Message 
Sender: Yun Gao
Recipient: dev
Date: Friday, Nov 8, 2019 18:37
Subject: Re: [ANNOUNCE] Jark Wu is now part of the Flink PMC


Congratulations Jark!

Best,
Yun

From: wenlong.lwl, Send Time: 2019 Nov. 8 (Fri.) 18:31, To: dev, Subject: Re: [ANNOUNCE] Jark Wu is now part of the Flink PMC
Congratulations Jark, well deserved! Best, Wenlong Lyu

On Fri, 8 Nov 2019 at 18:22, tison wrote:
Congrats Jark! Best, tison.

Jingsong Li wrote on Fri, Nov 8, 2019 at 6:08 PM:
Congratulations to Jark. Jark has really contributed a lot to the table layer over a long time. Well deserved. Best, Jingsong Lee

On Fri, Nov 8, 2019 at 6:05 PM Yu Li wrote:
Congratulations Jark! Well deserved! Best Regards, Yu

On Fri, 8 Nov 2019 at 17:55, OpenInx wrote:
Congrats Jark! Well deserved.

On Fri, Nov 8, 2019 at 5:53 PM Paul Lam wrote:
Congrats Jark! Best, Paul Lam

On Nov 8, 2019, at 17:51, jincheng sun wrote:
Hi all,
On behalf of the Flink PMC, I'm happy to announce that Jark Wu is now part of 
the Apache Flink Project Management Committee (PMC).
Jark has been a committer since February 2017. He has been very active on 
Flink's Table API / SQL component, as well as frequently helping 
manage/verify/vote releases. He has been writing many blogs about Flink and 
driving the translation work of the Flink website and documentation. He is 
very active in the China community, giving talks about Flink at many events in China.
Congratulations & welcome, Jark!
Best,
Jincheng (on behalf of the Flink PMC)

Re: RocksDB state on HDFS seems not being cleaned up

2019-11-05 Thread bupt_ljy
Hi Shuwen,


The “shared” directory means that the state files there are shared among multiple 
checkpoints, which happens when you enable incremental checkpointing[1]. Therefore, 
it’s reasonable that the size keeps growing if you set 
“state.checkpoints.num-retained” to a big value.


[1] https://flink.apache.org/features/2018/01/30/incremental-checkpointing.html
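For reference, here is a minimal sketch of the setup being discussed, i.e. the 
RocksDB state backend with incremental checkpoints plus retained checkpoints (the 
checkpoint path is a placeholder; the constructor flag and the 
"state.checkpoints.num-retained" key follow the Flink documentation):

    import org.apache.flink.contrib.streaming.state.RocksDBStateBackend;
    import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

    public class IncrementalCheckpointSetup {
        public static void main(String[] args) throws Exception {
            StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

            // RocksDB state backend with incremental checkpoints enabled: each
            // checkpoint uploads only new/changed SST files, and those files land
            // in the "shared" directory mentioned above.
            env.setStateBackend(new RocksDBStateBackend("hdfs:///flink/checkpoints", true));

            // Take a checkpoint every minute.
            env.enableCheckpointing(60_000);

            // How many completed checkpoints are kept is controlled in
            // flink-conf.yaml, e.g.:
            //   state.checkpoints.num-retained: 3
            // A file in "shared" is only removed once no retained checkpoint
            // references it anymore.
        }
    }

The more checkpoints you retain, the longer shared SST files stay referenced, which 
is why the shared directory grows with a large num-retained value.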


Best,
Jiayi Liao


 Original Message 
Sender: shuwen zhou
Recipient: dev
Date: Tuesday, Nov 5, 2019 17:59
Subject: RocksDB state on HDFS seems not being cleaned up


Hi Community,

I have a job running on Flink 1.9.0 on YARN with RocksDB on HDFS and incremental 
checkpointing enabled. I have some MapState in the code with the following config:

    val ttlConfig = StateTtlConfig
      .newBuilder(Time.minutes(30))
      .updateTtlOnCreateAndWrite()
      .cleanupInBackground()
      .cleanupFullSnapshot()
      .setStateVisibility(StateTtlConfig.StateVisibility.ReturnExpiredIfNotCleanedUp)
      .build()

After running for around 2 days, I observed that the checkpoint folder is showing:

    44.4 M  /flink-chk743e4568a70b626837b/chk-40
    65.9 M  /flink-chk743e4568a70b626837b/chk-41
    91.7 M  /flink-chk743e4568a70b626837b/chk-42
    96.1 M  /flink-chk743e4568a70b626837b/chk-43
    48.1 M  /flink-chk743e4568a70b626837b/chk-44
    71.6 M  /flink-chk743e4568a70b626837b/chk-45
    50.9 M  /flink-chk743e4568a70b626837b/chk-46
    90.2 M  /flink-chk743e4568a70b626837b/chk-37
    49.3 M  /flink-chk743e4568a70b626837b/chk-38
    96.9 M  /flink-chk743e4568a70b626837b/chk-39
    797.9 G /flink-chk743e4568a70b626837b/shared

The ./shared folder size keeps increasing and the folder does not seem to be 
cleaned up. However, when I disabled incremental cleanup, the expired full snapshot 
was removed automatically. Is there any way to remove outdated state on HDFS to 
stop it from increasing?

Thanks.

-- 
Best Wishes,
Shuwen Zhou

Re: [DISCUSS] Semantic and implementation of per-job mode

2019-10-31 Thread bupt_ljy
Hi all,


Firstly, thanks @tison for bringing this up, and a strong +1 for the overall design.


I’d like to add one more example of "multiple jobs in one program" from what 
I’m currently working on. I’m trying to run a TPC-DS benchmark (including tens 
of SQL query jobs) on Flink and I'm suffering a lot from maintaining the client, 
because I can’t run this program in per-job mode and have to keep the client attached.
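For context, a minimal sketch of what such a "multiple jobs in one program" main() 
looks like (the job names and the toy pipeline are placeholders; a real TPC-DS run 
would submit one SQL query per iteration):

    import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

    public class MultiJobProgram {
        public static void main(String[] args) throws Exception {
            // Placeholder for e.g. the list of TPC-DS queries.
            String[] jobs = {"query-1", "query-2", "query-3"};

            for (String job : jobs) {
                StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

                // Illustrative pipeline standing in for one benchmark query.
                env.fromElements(job)
                   .map(name -> "running " + name)
                   .print();

                // Each execute() compiles a job graph in the client and submits a
                // separate job, so the client has to stay attached for the whole
                // run; this is exactly the pain point described above.
                env.execute(job);
            }
        }
    }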


Back to our discussion: I can see there is a divergence about whether the job 
graph is compiled in the client or in the #ClusterEntrypoint, and there are 
upsides and downsides to either way. As for the opt-in solution, I have a question: 
what if the user chooses detached mode, compiles in the client, and runs a 
multi-job program at the same time? That still won't work.
Besides, by adding a compilation option, we need to consider more things when 
submitting a job, like "Does my program include multiple jobs?" or "Does the 
program need to be initialized before submitting to a remote cluster?", which 
looks a bit complicated and confusing to me.


To summarize, I'll vote for the new per-program concept, but I don't prefer the 
opt-in option mentioned in the mailing list; maybe we need to reconsider and find 
a concept and definition that is easier to understand.




Best,
Jiayi Liao


 Original Message 
Sender: Rong Rong
Recipient: Chan, Regina
Cc: Theo Diefenthal; u...@flink.apache.org
Date: Friday, Nov 1, 2019 11:01
Subject: Re: [DISCUSS] Semantic and implementation of per-job mode


Hi All,


Thanks @Tison for starting the discussion. I think we have a very similar 
scenario to Theo's use cases. In our case we also generate the job graphs using a 
client service (which serves job graph generation for multiple pieces of user 
code), and we've found that managing the upload/download between the cluster and 
the DFS is tricky and error-prone. In addition, managing the different 
environments and requirements of different users in a single service poses even 
more trouble for us.


However, shifting the job graph generation to the cluster side also requires some 
thought about how to manage the driver job as well as some dependency conflicts: 
in the case of shipping the job graph generation to the cluster, some dependencies 
that are unnecessary for the runtime will be pulled in by the driver job (correct 
me if I'm wrong, Theo).



I think in general I agree with @Gyula's main point: unless there is a very 
strong reason, it is better if we put the driver-mode as an opt-in (at least at 
the beginning). 

I left some comments on the document as well. Please kindly take a look.


Thanks,
Rong


On Thu, Oct 31, 2019 at 9:26 AM Chan, Regina  wrote:

Yeah, just chiming in on this conversation as well. We heavily use multiple job 
graphs to get isolation around retry logic and resource allocation across the 
job graphs. Putting all these parallel flows into a single graph would mean 
sharing TaskManagers across what was meant to be truly independent.
 
We also build our job graphs dynamically, based on the state of the world at 
the start of the job. While we’ve had our share of the pain described, my 
understanding is that there would be a tradeoff between the number of jobs being 
submitted to the cluster and the corresponding resource allocation requests. In the 
model with multiple jobs in one program, there’s at least the opportunity to 
reuse idle TaskManagers.
 
 
 
 
From: Theo Diefenthal  
 Sent: Thursday, October 31, 2019 10:56 AM
 To: u...@flink.apache.org
 Subject: Re: [DISCUSS] Semantic and implementation of per-job mode
 
I agree with Gyula Fora,
 
In our case, we have a client-machine in the middle between our YARN cluster 
and some backend services, which can not be reached directly from the cluster 
nodes. On application startup, we connect to some external systems, get some 
information crucial for the job runtime and finally build up the job graph to 
be committed.
 
It is true that we could work around this, but it would be pretty annoying to 
connect to the remote services, collect the data, upload it to HDFS, start the 
job and make sure housekeeping of those files is also done at some later time. 
 
The current behavior also corresponds to the behavior of Sparks driver mode, 
which made the transition from Spark to Flink easier for us. 
 
But I see the point, especially in terms of Kubernetes, and would thus also vote 
for an opt-in solution, with client-side compilation being the default and an 
option for the per-program mode as well.
 
Best regards
 
From: "Flavio Pompermaier" 
 To: "Yang Wang" 
 CC: "tison", "Newport, Billy", "Paul Lam", "SHI Xiaogang", 
"dev", "user" 
 Sent: Thursday, October 31, 2019 10:45:36
 Subject: Re: [DISCUSS] Semantic and implementation of per-job mode
 
Hi all, 
we use multiple jobs in one program a lot, and this is why: when you fetch data 
from a huge number of sources and, for each source, you do some transformation 
and then you

Re: [ANNOUNCE] Becket Qin joins the Flink PMC

2019-10-28 Thread bupt_ljy
Congratulations Becket!


Best,
Jiayi Liao


 Original Message 
Sender: Biao Liu
Recipient: dev
Date: Tuesday, Oct 29, 2019 12:10
Subject: Re: [ANNOUNCE] Becket Qin joins the Flink PMC


Congrats Becket!

Thanks,
Biao /'bɪ.aʊ/

On Tue, 29 Oct 2019 at 12:07, jincheng sun wrote:
Congratulations Becket. Best, Jincheng

Rui Li wrote on Tue, Oct 29, 2019 at 11:37 AM:
Congrats Becket!

On Tue, Oct 29, 2019 at 11:20 AM Leonard Xu wrote:
Congratulations! Becket. Best, Leonard Xu

On Oct 29, 2019, at 11:00 AM, Zhenghua Gao wrote:
Congratulations, Becket! Best Regards, Zhenghua Gao

On Tue, Oct 29, 2019 at 10:34 AM Yun Gao wrote:
Congratulations Becket! Best, Yun

From: Jingsong Li, Send Time: 2019 Oct. 29 (Tue.) 10:23, To: dev, Subject: Re: [ANNOUNCE] Becket Qin joins the Flink PMC
Congratulations Becket! Best, Jingsong Lee

On Tue, Oct 29, 2019 at 10:18 AM Terry Wang wrote:
Congratulations, Becket! Best, Terry Wang

On Oct 29, 2019, at 10:12, OpenInx wrote:
Congratulations Becket!

On Tue, Oct 29, 2019 at 10:06 AM Zili Chen wrote:
Congratulations Becket! Best, tison.

Congxian Qiu wrote on Tue, Oct 29, 2019 at 9:53 AM:
Congratulations Becket! Best, Congxian

Wei Zhong wrote on Tue, Oct 29, 2019 at 9:42 AM:
Congratulations Becket! Best, Wei

On Oct 29, 2019, at 09:36, Paul Lam wrote:
Congrats Becket! Best, Paul Lam

On Oct 29, 2019, at 02:18, Xingcan Cui wrote:
Congratulations, Becket! Best, Xingcan

On Oct 28, 2019, at 1:23 PM, Xuefu Z wrote:
Congratulations, Becket! -- Xuefu Zhang

On Mon, Oct 28, 2019 at 10:08 AM Zhu Zhu wrote:
Congratulations Becket! Thanks, Zhu Zhu

Peter Huang wrote on Tue, Oct 29, 2019 at 1:01 AM:
Congratulations Becket Qin! Best Regards, Peter Huang

On Mon, Oct 28, 2019 at 9:19 AM Rong Rong wrote:
Congratulations Becket!! -- Rong

On Mon, Oct 28, 2019, 7:53 AM Jark Wu wrote:
Congratulations Becket! Best, Jark

On Mon, 28 Oct 2019 at 20:26, Benchao Li wrote:
Congratulations Becket.
-- Benchao Li, School of Electronics Engineering and Computer Science, Peking University, Tel: +86-15650713730, Email: libenc...@gmail.com; libenc...@pku.edu.cn

Dian Fu wrote on Mon, Oct 28, 2019 at 7:22 PM:
Congrats, Becket.

On Oct 28, 2019, at 6:07 PM, Fabian Hueske wrote:
Hi everyone,
I'm happy to announce that Becket Qin has joined the Flink PMC.
Let's congratulate and welcome Becket as a new member of the Flink PMC!
Cheers,
Fabian

Re: [DISCUSS] Introduce a location-oriented two-stage query mechanism to improve the queryable state.

2019-10-24 Thread bupt_ljy
Hi vino,
+1 for improving the queryable state feature. This reminds me of the 
state processor API module, which is very helpful when we analyze state 
offline. However, currently we don’t have many ways to know what is happening 
with the state inside a running application, which makes me feel this has 
good potential. Since these two modules are separate but do similar 
work (analyzing state), maybe we should think more about their positioning, or 
integrate them gracefully in the future.
Anyway, this is great work, and it would be even better if we could hear more 
thoughts and use cases.


Best Regards,
Jiayi Liao


 Original Message 
Sender: vino yang
Recipient: dev@flink.apache.org
Date: Tuesday, Oct 22, 2019 15:42
Subject: [DISCUSS] Introduce a location-oriented two-stage query mechanism 
to improve the queryable state.


Hi guys,


Currently, the queryable state client is hard to use, because it requires users 
to know the address of the TaskManager and the port of the proxy. Actually, most 
users in production do not have good knowledge of Flink's internals and runtime. 
The queryable state clients directly interact with the query state client proxies 
hosted on each TaskExecutor. This design requires users to know too much detail.
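To illustrate the problem, this is roughly what a query looks like with the 
current client (a sketch based on the QueryableStateClient API; the TaskManager 
host, proxy port, job id, state name and key are all placeholders the user has 
to figure out by themselves):

    import java.util.concurrent.CompletableFuture;

    import org.apache.flink.api.common.JobID;
    import org.apache.flink.api.common.state.ValueState;
    import org.apache.flink.api.common.state.ValueStateDescriptor;
    import org.apache.flink.api.common.typeinfo.BasicTypeInfo;
    import org.apache.flink.api.common.typeinfo.Types;
    import org.apache.flink.queryablestate.client.QueryableStateClient;

    public class QueryStateExample {
        public static void main(String[] args) throws Exception {
            // The user must already know the address of the TaskManager that holds
            // the key and the queryable-state proxy port configured on it.
            QueryableStateClient client = new QueryableStateClient("taskmanager-host", 9069);

            ValueStateDescriptor<Long> descriptor = new ValueStateDescriptor<>("count", Types.LONG);

            CompletableFuture<ValueState<Long>> resultFuture = client.getKvState(
                    JobID.fromHexString("0123456789abcdef0123456789abcdef"),
                    "query-name",      // name registered via asQueryableState(...)
                    "some-key",        // the key to look up
                    BasicTypeInfo.STRING_TYPE_INFO,
                    descriptor);

            System.out.println(resultFuture.get().value());
            client.shutdownAndWait();
        }
    }

A location service in front of this would let the client talk to a single, stable 
endpoint instead of a per-TaskManager host/port pair.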
 
We introduce a location service component to improve the architecture of the 
queryable state and hide the details of the task executors. We first give a 
brief introduction to our design in Section 2 and then detail the 
implementation in Section 3. At last, we describe some future work that can be 
done.








I have provided an initial implementation in my Flink repository[2]. One thing 
that needs to be stated is that we have not changed the existing solution, so 
it still works in the previous mode.



The design documentation is here[3].


Any suggestions and feedback are welcome and appreciated.


[1]: https://statefun.io/
[2]: https://github.com/yanghua/flink/tree/improve-queryable-state-master
[3]: 
https://docs.google.com/document/d/181qYVIiHQGrc3hCj3QBn1iEHF4bUztdw4XO8VSaf_uI/edit?usp=sharing


Best,
Vino

Re: Per Key Grained Watermark Support

2019-09-23 Thread bupt_ljy
Hi Congxian,
Thanks but by doing that, we will lose some features like output of the late 
data. 


 Original Message 
Sender: Congxian Qiu
Recipient: Lasse Nedergaard
Cc: 廖嘉逸; u...@flink.apache.org; 
dev@flink.apache.org
Date: Monday, Sep 23, 2019 19:56
Subject: Re: Per Key Grained Watermark Support


Hi
There was a discussion about this issue[1]. As that discussion concluded, at 
the moment this is not supported out of the box by Flink, so I think you can try 
a keyed process function, as Lasse said.


[1] 
http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/Watermark-for-each-key-td27485.html#a27516

Best,
Congxian




Lasse Nedergaard wrote on Mon, Sep 23, 2019 at 12:42 PM:

Hi Jiayi


We have faced the same challenge, as we deal with IoT units and they do not 
necessarily share the same timestamp. Watermarks per key would be a perfect match 
here. We tried to work around it by handling late events as a special case with 
side outputs, but that isn’t a perfect solution. 
My conclusion is to skip watermarks, create a keyed process function and 
handle the time for each key myself.


Med venlig hilsen / Best regards
Lasse Nedergaard
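For readers who want to try that workaround, here is a minimal sketch of tracking 
time per key in a KeyedProcessFunction instead of relying on the operator 
watermark (the (key, timestamp) input type and the 30-second lateness bound are 
made-up placeholders):

    import org.apache.flink.api.common.state.ValueState;
    import org.apache.flink.api.common.state.ValueStateDescriptor;
    import org.apache.flink.api.common.typeinfo.Types;
    import org.apache.flink.api.java.tuple.Tuple2;
    import org.apache.flink.configuration.Configuration;
    import org.apache.flink.streaming.api.functions.KeyedProcessFunction;
    import org.apache.flink.util.Collector;

    // Each key tracks the highest event timestamp it has seen and judges
    // late-ness only against its own progress, so a slow key cannot hold back
    // or mis-classify a fast key.
    public class PerKeyTimeFunction extends KeyedProcessFunction<String, Tuple2<String, Long>, String> {

        private static final long ALLOWED_LATENESS_MS = 30_000;

        private transient ValueState<Long> maxTimestampPerKey;

        @Override
        public void open(Configuration parameters) {
            maxTimestampPerKey = getRuntimeContext().getState(
                    new ValueStateDescriptor<>("max-ts", Types.LONG));
        }

        @Override
        public void processElement(Tuple2<String, Long> event, Context ctx, Collector<String> out) throws Exception {
            long eventTime = event.f1;
            Long seen = maxTimestampPerKey.value();
            long currentMax = seen == null ? Long.MIN_VALUE : seen;

            if (eventTime + ALLOWED_LATENESS_MS < currentMax) {
                // Late relative to this key's own progress only.
                out.collect("late event for key " + event.f0 + " at " + eventTime);
                return;
            }

            maxTimestampPerKey.update(Math.max(currentMax, eventTime));
            out.collect("on-time event for key " + event.f0 + " at " + eventTime);
        }
    }

Window- or CEP-style firing would then be driven by per-key timers or by comparing 
against maxTimestampPerKey, rather than by the task-wide watermark.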



On 23 Sep 2019, at 06:16, 廖嘉逸 wrote:


Hi all,
Currently watermarks are only supported at the task level (or partition level), 
which means that data belonging to a faster key has to share the same watermark 
with data belonging to a slower key in the same key group of a KeyedStream. This 
leads to two problems:


1. Latency. For example, every key has its own window state, but the window can 
only be triggered after its end time is exceeded by the watermark, which is 
usually determined by the data belonging to the slowest key. (The same applies to 
the CepOperator and other operators that use watermarks to fire results.)
2. State size. Because the faster key delays firing its result, it has to store 
more redundant state that could have been pruned earlier.


However, since watermarks were introduced a long time ago and were not designed to 
be fine-grained in the first place, I find it very hard to solve this problem 
without a big change. I wonder if anyone in the community has successful 
experience with this, or whether there is a shortcut? If not, I can try to draft a 
design if the community needs this.




Best Regards,
Jiayi Liao

Re: [ANNOUNCE] Zili Chen becomes a Flink committer

2019-09-11 Thread bupt_ljy
Congratulations!


Best,
Jiayi Liao


 Original Message 
Sender: Till Rohrmann
Recipient: dev; user
Date: Wednesday, Sep 11, 2019 17:22
Subject: [ANNOUNCE] Zili Chen becomes a Flink committer


Hi everyone,


I'm very happy to announce that Zili Chen (some of you might also know him as 
Tison Kun) accepted the offer of the Flink PMC to become a committer of the 
Flink project.


Zili Chen has been an active community member for almost 16 months now. He 
helped push the FLIP-6 effort over the finish line, ported a lot of legacy 
code tests, removed a good part of the legacy code, contributed numerous fixes, 
is involved in Flink's client API refactoring, drives the refactoring of 
Flink's HighAvailabilityServices and much more. Zili Chen also helped the 
community through PR reviews, reporting Flink issues, answering user mails and 
being very active on the dev mailing list.


Congratulations Zili Chen!


Best, Till 

(on behalf of the Flink PMC)

Re: [ANNOUNCE] Jiangjie (Becket) Qin has been added as a committer to the Flink project

2019-07-18 Thread bupt_ljy
Congratulations !


Best Regards,
Jiayi Liao


Original Message
Sender: Hequn Cheng chenghe...@gmail.com
Recipient: dev...@flink.apache.org
Date: Thursday, Jul 18, 2019 17:51
Subject: Re: [ANNOUNCE] Jiangjie (Becket) Qin has been added as a committer to the Flink project


Congratulations Becket!

Best,
Hequn

On Thu, Jul 18, 2019 at 5:34 PM vino yang wrote:
Congratulations! Best, Vino

Yun Gao wrote on Thu, Jul 18, 2019 at 5:31 PM:
Congratulations! Best, Yun

From: Kostas Kloudas, Send Time: 2019 Jul. 18 (Thu.) 17:30, To: dev, Subject: Re: [ANNOUNCE] Jiangjie (Becket) Qin has been added as a committer to the Flink project
Congratulations Becket! Kostas

On Thu, Jul 18, 2019 at 11:21 AM Guowei Ma wrote:
Congrats Becket! Best, Guowei

Terry Wang wrote on Thu, Jul 18, 2019 at 5:17 PM:
Congratulations Becket!

On Jul 18, 2019, at 5:09 PM, Dawid Wysakowicz wrote:
Congratulations Becket! Good to have you onboard!

On 18/07/2019 10:56, Till Rohrmann wrote:
Congrats Becket!

On Thu, Jul 18, 2019 at 10:52 AM Jeff Zhang wrote:
Congratulations Becket! -- Best Regards, Jeff Zhang

Xu Forward wrote on Thu, Jul 18, 2019 at 4:39 PM:
Congratulations Becket! Well deserved. Cheers, forward

Kurt Young wrote on Thu, Jul 18, 2019 at 4:20 PM:
Congrats Becket! Best, Kurt

On Thu, Jul 18, 2019 at 4:12 PM JingsongLee wrote:
Congratulations Becket! Best, Jingsong Lee

From: Congxian Qiu, Send Time: Thursday, Jul 18, 2019 16:09, To: dev@flink.apache.org, Subject: Re: [ANNOUNCE] Jiangjie (Becket) Qin has been added as a committer to the Flink project
Congratulations Becket! Well deserved. Best, Congxian

Jark Wu wrote on Thu, Jul 18, 2019 at 4:03 PM:
Congratulations Becket! Well deserved. Cheers, Jark

On Thu, 18 Jul 2019 at 15:56, Paul Lam wrote:
Congrats Becket! Best, Paul Lam

On Jul 18, 2019, at 15:41, Robert Metzger wrote:
Hi all,
I'm excited to announce that Jiangjie (Becket) Qin just became a Flink committer!
Congratulations Becket!
Best,
Robert (on behalf of the Flink PMC)

Re: Add relative path support in Savepoint Connector

2019-07-17 Thread bupt_ljy
Hi Konstantin,
Thank you for your feedback. You’re right that this part belongs to savepoint 
deserialization.
    This is an old issue which, according to the comments, should have been 
resolved before the 1.3 release. Anyway, I’m going to keep following it.


Best Regards,
Jiayi Liao


Original Message
Sender: Konstantin Knauf konstan...@ververica.com
Recipient: dev...@flink.apache.org; Stefan Richter s.rich...@ververica.com
Date: Wednesday, Jul 17, 2019 16:28
Subject: Re: Add relative path support in Savepoint Connector


Hi Jiayi,

I think this is not an issue with the State Processor API specifically, but with 
savepoints in general. The _metadata file of a savepoint uses absolute path 
references. There is a pretty old Jira ticket which already mentions this 
limitation [1]. Stefan (cc) might know more about any ongoing development in that 
direction and might have an idea of the effort of making savepoints relocatable.

Best,
Konstantin

[1] https://issues.apache.org/jira/browse/FLINK-5763

On Wed, Jul 17, 2019 at 8:35 AM bupt_ljy bupt_...@163.com wrote:
Hi again,
Does anyone have an opinion on this topic?
Best Regards,
Jiayi Liao

Original Message
Sender: bupt_ljy bupt_...@163.com
Recipient: dev...@flink.apache.org
Cc: Tzu-Li (Gordon) Tai tzuli...@apache.org
Date: Tuesday, Jul 16, 2019 15:24
Subject: Add relative path support in Savepoint Connector

Hi all,
Firstly I appreciate Gordon and Seth's effort on this feature, which is really 
helpful for our production use. Like you mentioned in FLINK-12047, one of the 
production uses is deriving new state from the existing state. However, since the 
state handle uses the absolute path to get the input stream, we need to operate on 
the state directly in the production environment, which is not an anxiety-reducing 
situation, at least for me.
So I wonder if we can add relative path support in this module, because the files 
are persisted in a single directory after we take a savepoint, which makes it 
achievable. I'm not sure whether my scenario is a common case or not, but I think 
I can contribute this if you are all okay with it.
Best Regards,
Jiayi Liao

-- 
Konstantin Knauf | Solutions Architect
+49 160 91394525
Planned Absences: 10.08.2019 - 31.08.2019, 05.09. - 06.09.2010
--
Ververica GmbH | Invalidenstrasse 115, 10115 Berlin, Germany
--
Ververica GmbH
Registered at Amtsgericht Charlottenburg: HRB 158244 B
Managing Directors: Dr. Kostas Tzoumas, Dr. Stephan Ewen

Re: Add relative path support in Savepoint Connector

2019-07-16 Thread bupt_ljy
Hi again,
Does anyone have an opinion on this topic?


Best Regards,
Jiayi Liao


Original Message
Sender: bupt_ljy bupt_...@163.com
Recipient: dev...@flink.apache.org
Cc: Tzu-Li (Gordon) Tai tzuli...@apache.org
Date: Tuesday, Jul 16, 2019 15:24
Subject: Add relative path support in Savepoint Connector


Hi all,
Firstly I appreciate Gordon and Seth’s effort on this feature, which is 
really helpful for our production use. Like you mentioned in FLINK-12047, 
one of the production uses is deriving new state from the existing state. 
However, since the state handle uses the absolute path to get the input stream, 
we need to operate on the state directly in the production environment, which is 
not an anxiety-reducing situation, at least for me.
So I wonder if we can add relative path support in this module, because the 
files are persisted in a single directory after we take a savepoint, which makes 
it achievable. I’m not sure whether my scenario is a common case or not, but I 
think I can contribute this if you are all okay with it.




Best Regards,
Jiayi Liao

Add relative path support in Savepoint Connector

2019-07-16 Thread bupt_ljy
Hi all,
Firstly I appreciate Gordon and Seth’s effort on this feature, which is 
really helpful for our production use. Like you mentioned in FLINK-12047, 
one of the production uses is deriving new state from the existing state. 
However, since the state handle uses the absolute path to get the input stream, 
we need to operate on the state directly in the production environment, which is 
not an anxiety-reducing situation, at least for me.
So I wonder if we can add relative path support in this module, because the 
files are persisted in a single directory after we take a savepoint, which makes 
it achievable. I’m not sure whether my scenario is a common case or not, but I 
think I can contribute this if you are all okay with it.
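For readers following this thread, here is a minimal sketch of the usage in 
question with the State Processor API; note how the savepoint is addressed by an 
absolute URI, which is exactly the coupling a relative-path mode would relax 
(operator uid, state name, types and paths are placeholders):

    import org.apache.flink.api.common.state.ValueState;
    import org.apache.flink.api.common.state.ValueStateDescriptor;
    import org.apache.flink.api.common.typeinfo.Types;
    import org.apache.flink.api.java.DataSet;
    import org.apache.flink.api.java.ExecutionEnvironment;
    import org.apache.flink.configuration.Configuration;
    import org.apache.flink.runtime.state.memory.MemoryStateBackend;
    import org.apache.flink.state.api.ExistingSavepoint;
    import org.apache.flink.state.api.Savepoint;
    import org.apache.flink.state.api.functions.KeyedStateReaderFunction;
    import org.apache.flink.util.Collector;

    public class ReadSavepointExample {

        public static void main(String[] args) throws Exception {
            ExecutionEnvironment bEnv = ExecutionEnvironment.getExecutionEnvironment();

            // The savepoint (and the state handles inside its _metadata file) are
            // referenced by absolute paths today.
            ExistingSavepoint savepoint = Savepoint.load(
                    bEnv, "hdfs:///flink/savepoints/savepoint-abc123", new MemoryStateBackend());

            DataSet<Long> counts = savepoint.readKeyedState("my-operator-uid", new CountReader());
            counts.print();
        }

        static class CountReader extends KeyedStateReaderFunction<String, Long> {
            private ValueState<Long> count;

            @Override
            public void open(Configuration parameters) {
                count = getRuntimeContext().getState(new ValueStateDescriptor<>("count", Types.LONG));
            }

            @Override
            public void readKey(String key, Context ctx, Collector<Long> out) throws Exception {
                out.collect(count.value());
            }
        }
    }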




Best Regards,
Jiayi Liao

Re: RE: [DISCUSS] Improve Queryable State and introduce a QueryServerProxy component

2019-07-02 Thread bupt_ljy
Hi vino,
Big +1 for this.
Glad to see new progress on this topic! I’ve left some comments on it.


Best Regards,
Jiayi Liao


Original Message
Sender: vino yang yanghua1...@gmail.com
Recipient: Georgi Stoyanov gstoya...@live.com
Cc: dev...@flink.apache.org; user u...@flink.apache.org; Stefan Richter s.rich...@ververica.com; Aljoscha Krettek aljos...@apache.org; kkloudas@gmail.com kklou...@gmail.com; Stephan Ewen se...@apache.org; liyu@apache.org l...@apache.org; Tzu-Li (Gordon) Tai tzuli...@apache.org
Date: Tuesday, Jul 2, 2019 16:45
Subject: Re: RE: [DISCUSS] Improve Queryable State and introduce a QueryServerProxy component


Hi all,


Since the last discussion, I have tried to further refine the design of this topic 
and wrote a design document with more detailed design figures and text 
descriptions, so that it is more conducive to discussion. [1]

Note: the document is not yet complete; for example, the "Implementation" 
section is missing. Therefore, it is still in an open discussion state. I will 
improve the rest while listening to the opinions of the community.

Welcome and appreciate more discussions and feedback.



Best,
Vino


[1]:https://docs.google.com/document/d/181qYVIiHQGrc3hCj3QBn1iEHF4bUztdw4XO8VSaf_uI/edit?usp=sharing




yanghua1127 yanghua1...@gmail.com wrote on Fri, Jun 7, 2019 at 11:32 PM:

Hi Georgi,

Thanks for your feedback, and glad to hear you are using queryable state.

I agree that option 1 is easier to implement than the others. However, when we 
design the new architecture we need to consider more aspects, e.g. scalability, 
so it seems option 3 is more suitable. Actually, some committers such as 
Stefan, Gordon and Aljoscha have given me feedback and direction.

Currently, I am writing the design document. When it is ready to be presented, I 
will post it to this thread and we can discuss further details.


Best,
Vino



On 2019-06-07 19:03 , Georgi Stoyanov Wrote: 


Hi Vino,

I was investigating the current architecture, and AFAIK the first proposal will 
be a lot easier to implement, because currently the JM has the information about 
the states (where, which, etc., thanks to the KvStateLocationRegistry; correct me 
if I’m wrong).
We are using the feature, and it’s indeed not very convenient to iterate through 
ports, check which TM is the responsible one, and so on.

It would be very useful if someone from the committers joined the topic and gave 
us some insight into what’s going to happen with this feature.


Kind Regards,
Georgi



From: vino yang yanghua1...@gmail.com 
 Sent: Thursday, April 25, 2019 5:18 PM
 To: dev dev@flink.apache.org; user u...@flink.apache.org
 Cc: Stefan Richter s.rich...@ververica.com; Aljoscha Krettek 
aljos...@apache.org; kklou...@gmail.com
 Subject: [DISCUSS] Improve Queryable State and introduce a QueryServerProxy 
component

Hi all,

I want to share my thoughts with you about improving the queryable state and 
introducing a QueryServerProxy component.

I think the current queryable state client is hard to use, because it needs 
users to know the TaskManager's address and the proxy's port. Actually, some 
business users in production do not have good knowledge about Flink's internals 
or runtime. However, sometimes they need to query the values of states.

IMO, the reason for this problem is the queryable state's architecture. Currently, 
the queryable state clients interact with query state client proxy components 
which are hosted on each TaskManager. This design makes it difficult to 
encapsulate the point of change and exposes too much detail to the user.

My personal idea is that we could introduce a real queryable state server, named 
e.g. QueryStateProxyServer, which would receive all the query state requests, 
query the local registry and then redirect each request to the specific 
QueryStateClientProxy (which runs on each TaskManager). This server is what users 
really care about, and it would keep users ignorant of the TaskManagers' addresses 
and the proxies' ports. The current QueryStateClientProxy would become the 
QueryStateProxyClient.

Generally speaking, the roles of the QueryStateProxyServer are listed below:

* works as the proxy for all query clients, receiving all the requests and sending responses;
* acts as a router to redirect the real query requests to the specific proxy client;
* maintains a route table registry (state - TaskManager, TaskManager - proxy client address);
* provides more fine-grained control, such as result caching, ACL, TTL, SLA (rate limiting) and so on.

About the implementation, there are three opts:

opt 1: let the JobManager act as the query proxy server.

* pros: reuses the existing JM; not introducing a new process reduces complexity;
* cons: puts a heavier burden on the JM and, depending on the query frequency, may impact its stability.

opt 2: introduce a new component which runs as a single process and acts as the query proxy server.

* pros: reduces the burden on the JM and keeps it more stable;
* cons: introducing a new component makes the implementation more complex.

opt 3 (suggestion comes

Re: A Question About Flink Network Stack

2019-06-18 Thread bupt_ljy
Hi Zhijiang,


Thank you for the detailed explanation!


Best Regards,
Jiayi Liao




Original Message
Sender: zhijiang wangzhijiang...@aliyun.com.INVALID
Recipient: dev...@flink.apache.org
Date: Tuesday, Jun 18, 2019 17:34
Subject: Re: A Question About Flink Network Stack


Hi Jiayi,

Thanks for looking into the network stack; you have pointed out a very good 
question, and your understanding is right.

In credit-based mode, the receiver side has a fixed number of exclusive buffers 
(credits) for each remote input channel, so that every channel can receive data in 
parallel without blocking the others. The receiver also has a floating shared 
buffer pool for all the input channels, in order to grant more credits to channels 
with a large backlog on the sender side. The sender side still uses a shared 
buffer pool for all the subpartitions.

In one-to-one mode, which means one producer only produces data for one consumer, 
there are no further concerns. In all-to-all mode, which means one producer emits 
data to all the consumers, the buffers in the pool might eventually accumulate in 
the slow subpartition until the pool is exhausted, leaving the other, faster 
subpartitions with no available buffers to fill with more data. This eventually 
causes backpressure, because the operator does not know the condition of buffer 
usage and cannot select which records to emit with priority. Only when a record is 
emitted by the producer do we know which subpartition it belongs to, via the 
ChannelSelector. If we did not serialize that record into the slow subpartition 
and occupy a buffer, we would need additional memory to cache the record, which is 
not within expectations and could cause instability. So on the producer side there 
seems to be no other choice until the buffer resources are exhausted.

Credit-based flow control is not meant to solve the backpressure issue, which 
cannot be avoided completely. It brings obvious benefits for one-to-one mode 
sharing a TCP channel in backpressure scenarios, avoids memory overhead in the 
Netty stack that could cause instability, and speeds up exactly-once checkpoints 
by avoiding spilling blocked data.

In addition, we once implemented an improvement of the RebalanceStrategy that 
takes the slow-subpartition issue into account. For the rebalance channel 
selector, a record can actually be emitted to any subpartition without correctness 
issues. So when a record is emitted, we select the fastest subpartition to take 
the record based on the current backlog size, instead of the previous round-robin 
way. This can bring benefits in some scenarios.

Best,
Zhijiang

From: bupt_ljy bupt_...@163.com, Send Time: Tuesday, Jun 18, 2019 16:35, To: dev dev@flink.apache.org, Subject: A Question About Flink Network Stack

Hi all,
I'm not very familiar with the network part of Flink, and a question occurred to 
me after reading most of the related source code. I've seen that Flink uses the 
credit-based mechanism to solve the blocking problem on the receivers' side, which 
means that one "slow" input channel won't block other input channels' consumption, 
because each channel has its own exclusive credits.
However, from the sender's side, I find that the memory segments sent to all 
receivers' channels share the same local segment pool (LocalBufferPool), which may 
cause a problem. Assume that we have a non-parallel source which is partitioned 
into a map operator whose parallelism is two, and one of the map tasks is 
consuming very slowly. Is there any possibility that the memory segments which 
should be sent to the slower receiver fill the whole local segment pool, blocking 
the data which should be sent to the faster receiver?
I appreciate any comments or answers, and please correct me if I am wrong about 
this.
Best Regards,
Jiayi Liao
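As a conceptual companion to the explanation above, here is a deliberately 
simplified model of the receiver-side credit accounting; it is not Flink code, 
only an illustration of why exclusive credits keep one slow channel from starving 
the others, while a shared pool alone (as on the sender side) cannot:

    import java.util.HashMap;
    import java.util.Map;

    public class CreditModel {

        private final Map<String, Integer> exclusiveCredits = new HashMap<>();
        private int floatingCredits;

        public CreditModel(String[] channels, int exclusivePerChannel, int floating) {
            for (String channel : channels) {
                exclusiveCredits.put(channel, exclusivePerChannel);
            }
            this.floatingCredits = floating;
        }

        // The sender may only transmit a buffer on a channel if a credit is available.
        public boolean tryConsumeCredit(String channel) {
            int credits = exclusiveCredits.get(channel);
            if (credits > 0) {
                exclusiveCredits.put(channel, credits - 1);
                return true;
            }
            if (floatingCredits > 0) {
                // Borrow from the shared floating pool, e.g. for a large backlog.
                floatingCredits--;
                return true;
            }
            // No credit: only this channel has to wait; the others keep their
            // exclusive credits.
            return false;
        }

        // Called when the receiver has processed a buffer and recycles it.
        public void returnCredit(String channel, boolean wasExclusive) {
            if (wasExclusive) {
                exclusiveCredits.merge(channel, 1, Integer::sum);
            } else {
                floatingCredits++;
            }
        }
    }

On the producer side there is no such per-subpartition reservation in the shared 
LocalBufferPool, which is why the slow-consumer scenario from the question below 
can exhaust the pool until backpressure kicks in.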

A Question About Flink Network Stack

2019-06-18 Thread bupt_ljy
Hi all,
I’m not very familiar with the network part of Flink, and a question occurred to 
me after reading most of the related source code. I’ve seen that Flink uses the 
credit-based mechanism to solve the blocking problem on the receivers’ side, which 
means that one “slow” input channel won’t block other input channels’ consumption, 
because each channel has its own exclusive credits.

However, from the sender’s side, I find that the memory segments sent to all 
receivers’ channels share the same local segment pool (LocalBufferPool), which may 
cause a problem. Assume that we have a non-parallel source which is partitioned 
into a map operator whose parallelism is two, and one of the map tasks is 
consuming very slowly. Is there any possibility that the memory segments which 
should be sent to the slower receiver fill the whole local segment pool, blocking 
the data which should be sent to the faster receiver?

I appreciate any comments or answers, and please correct me if I am wrong about 
this.




Best Regards,
Jiayi Liao

[jira] [Created] (FLINK-12668) Introduce fromParallelElements for generating DataStreamSource

2019-05-29 Thread bupt_ljy (JIRA)
bupt_ljy created FLINK-12668:


 Summary: Introduce fromParallelElements for generating 
DataStreamSource
 Key: FLINK-12668
 URL: https://issues.apache.org/jira/browse/FLINK-12668
 Project: Flink
  Issue Type: Improvement
Affects Versions: 1.8.0
Reporter: bupt_ljy
Assignee: bupt_ljy
 Fix For: 1.9.0


We already have a fromElements function in StreamExecutionEnvironment to 
generate a non-parallel DataStreamSource. We should introduce a similar 
fromParallelElements function because:

1. The current implementations of ParallelSourceFunction are mostly bound to 
external resources, like the Kafka source, and we need a more lightweight parallel 
source function that can be easily created. The SplittableIterator is too 
heavyweight for this, by the way.
2. It's very useful if we want to verify or test something in a parallel 
processing environment (see the sketch below).
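As a concrete illustration of the proposal, here is a sketch of what such a helper 
could look like; the name, signature and round-robin partitioning are illustrative 
only, not the actual design:

    import java.util.Arrays;
    import java.util.List;

    import org.apache.flink.api.common.typeinfo.TypeInformation;
    import org.apache.flink.streaming.api.datastream.DataStreamSource;
    import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
    import org.apache.flink.streaming.api.functions.source.RichParallelSourceFunction;

    public class ParallelElementsExample {

        public static <T> DataStreamSource<T> fromParallelElements(
                StreamExecutionEnvironment env, List<T> elements, Class<T> type) {

            // Note: the element list is captured by the source and must be serializable.
            return env.addSource(new RichParallelSourceFunction<T>() {
                private volatile boolean running = true;

                @Override
                public void run(SourceContext<T> ctx) {
                    // Each subtask emits every parallelism-th element, starting at
                    // its own subtask index.
                    int subtask = getRuntimeContext().getIndexOfThisSubtask();
                    int parallelism = getRuntimeContext().getNumberOfParallelSubtasks();
                    for (int i = subtask; i < elements.size() && running; i += parallelism) {
                        synchronized (ctx.getCheckpointLock()) {
                            ctx.collect(elements.get(i));
                        }
                    }
                }

                @Override
                public void cancel() {
                    running = false;
                }
            }, "parallel-elements", TypeInformation.of(type));
        }

        public static void main(String[] args) throws Exception {
            StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
            fromParallelElements(env, Arrays.asList(1, 2, 3, 4, 5, 6), Integer.class)
                    .setParallelism(3)
                    .print();
            env.execute("from-parallel-elements-demo");
        }
    }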





--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


Re: [DISCUSS] Improve Queryable State and introduce a QueryServerProxy component

2019-04-26 Thread bupt_ljy
Hi yang,
 +1 for this proposal. Queryable state is very commonly used in our scenarios 
when we debug and query the real-time status of streaming processes like CEP. And 
we've done a lot to improve the "user experience" of this feature, like exposing 
the TaskManager's proxy port in TaskManagerInfo.
 I'm looking forward to a more detailed and deeper discussion, and I'd like to 
contribute back to the community on this.


Best Regards,
Jiayi Liao


Original Message
Sender: vino yang yanghua1...@gmail.com
Recipient: dev@flink.apache.org
Date: Friday, Apr 26, 2019 16:41
Subject: Re: [DISCUSS] Improve Queryable State and introduce a QueryServerProxy component


Hi Paul,

Thanks for your reply. You are right that the queryable state currently has few 
users. And I totally agree with you, it makes streaming work more like a DB.

About the architecture and the problem you are concerned about: yes, it may affect 
the JobManager if they are deployed together. I think it's important to guarantee 
the JobManager's availability and stability, while the QueryProxyServer is just a 
secondary service component. That is why, when describing the role of the 
QueryProxyServer, I mentioned an SLA policy; I think that is one solution, but the 
details still need to be discussed.

About starting the queryable state client with a cmd script, I think it's a good 
and valuable idea.

Best,
Vino.

Paul Lam paullin3...@gmail.com wrote on Fri, Apr 26, 2019 at 3:31 PM:

Hi Vino,
Thanks a lot for bringing up the discussion! Queryable state has been at beta 
version for a long time, and due to its complexity and instability I think there 
are not many users, but there's great value in it, which brings "state as 
database" one step closer.
WRT the architecture, I'd vote for opt 3, because it fits the cloud architecture 
the most and avoids putting more burden on the JM (sometimes the queries could be 
slow and resource intensive). My concern is that on many cluster frameworks the 
container resources are limited (IIUC, the JM and QS are running in the same 
container); would the JM get killed if QS eats up too much memory?
And a minor suggestion: can we introduce a cmd script to set up a 
QueryableStateClient? That would be easier for users who want to try out this 
feature.
Best,
Paul Lam

On Apr 26, 2019, at 11:09, vino yang yanghua1...@gmail.com wrote:
Hi Quan,
Thanks for your reply. Actually, I did not try this way. But there are two factors 
we should consider:
1. The local state storage is not equal to RocksDB, otherwise Flink would not need 
to provide a queryable state client. What's more, querying RocksDB directly is 
still an address-explicit action.
2. IMO, the proposal's more valuable suggestion is to make the queryable state's 
architecture more reasonable, let it encapsulate more details and improve its 
scalability.
Best,
Vino

Shi Quan qua...@outlook.com wrote on Fri, Apr 26, 2019 at 10:38 AM:
Hi,
How about taking states from RocksDB directly? In this case, the TM host is 
unnecessary.
Best,
Quan Shi

From: vino yang yanghua1...@gmail.com, Sent: Thursday, April 25, 2019 10:18:20 PM, To: dev; user, Cc: Stefan Richter; Aljoscha Krettek; kklou...@gmail.com, Subject: [DISCUSS] Improve Queryable State and introduce a QueryServerProxy component

Hi all,
I want to share my thoughts with you about improving the queryable state and 
introducing a QueryServerProxy component.
I think the current queryable state client is hard to use, because it needs users 
to know the TaskManager's address and the proxy's port. Actually, some business 
users in production do not have good knowledge about Flink's internals or runtime. 
However, sometimes they need to query the values of states.
IMO, the reason for this problem is the queryable state's architecture. Currently, 
the queryable state clients interact with query state client proxy components 
which are hosted on each TaskManager. This design makes it difficult to 
encapsulate the point of change and exposes too much detail to the user.
My personal idea is that we could introduce a real queryable state server, named 
e.g. QueryStateProxyServer, which would receive all the query state requests, 
query the local registry and then redirect each request to the specific 
QueryStateClientProxy (which runs on each TaskManager). This server is what users 
really care about, and it would keep users ignorant of the TaskManagers' addresses 
and the proxies' ports. The current QueryStateClientProxy would become the 
QueryStateProxyClient.
Generally speaking, the roles of the QueryStateProxyServer are listed below:
* works as the proxy for all query clients, receiving all the requests and sending responses;
* acts as a router to redirect the real query requests to the specific proxy client;
* maintains a route table registry (state - TaskManager, TaskManager - proxy client address);
* more fine-grante

CEP - Support for multiple pattern

2018-12-15 Thread bupt_ljy
Hi, all
 It’s actually very common to construct more than one rule on the same data 
source. I’m developing some features of this kind for our business, and some ideas 
have come up.


 Do we have any plans for supporting multiple patterns in CEP?
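For context, this is what "more than one rule on the same data source" looks like 
with the current CEP API: each pattern produces its own PatternStream (and its own 
operator) over the same input, which is part of the motivation for first-class 
multi-pattern support. The event type and conditions below are placeholders:

    import java.util.List;
    import java.util.Map;

    import org.apache.flink.cep.CEP;
    import org.apache.flink.cep.PatternSelectFunction;
    import org.apache.flink.cep.pattern.Pattern;
    import org.apache.flink.cep.pattern.conditions.SimpleCondition;
    import org.apache.flink.streaming.api.datastream.DataStream;
    import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

    public class MultiRuleCepExample {

        public static void main(String[] args) throws Exception {
            StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
            DataStream<String> events = env.fromElements("login", "error", "error", "logout");

            // Rule 1: two consecutive "error" events.
            Pattern<String, ?> errorBurst = Pattern.<String>begin("first")
                    .where(new SimpleCondition<String>() {
                        @Override
                        public boolean filter(String value) {
                            return value.equals("error");
                        }
                    })
                    .next("second")
                    .where(new SimpleCondition<String>() {
                        @Override
                        public boolean filter(String value) {
                            return value.equals("error");
                        }
                    });

            // Rule 2: any "logout" event.
            Pattern<String, ?> logout = Pattern.<String>begin("logout")
                    .where(new SimpleCondition<String>() {
                        @Override
                        public boolean filter(String value) {
                            return value.equals("logout");
                        }
                    });

            // Today each rule becomes a separate CEP operator on the same stream.
            CEP.pattern(events, errorBurst).select(new MatchToString()).print();
            CEP.pattern(events, logout).select(new MatchToString()).print();

            env.execute("multi-rule-cep");
        }

        static class MatchToString implements PatternSelectFunction<String, String> {
            @Override
            public String select(Map<String, List<String>> match) {
                return match.toString();
            }
        }
    }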


Best,
Jiayi Liao

Re: [DISCUSS] Releasing Flink 1.7.1

2018-12-12 Thread bupt_ljy
Hi Chesnay,
  Thanks for these useful fixes.
  +1 for the release.


Best,
Jiayi Liao


Original Message
Sender: Hequn Cheng chenghe...@gmail.com
Recipient: dev...@flink.apache.org
Date: Thursday, Dec 13, 2018 09:37
Subject: Re: [DISCUSS] Releasing Flink 1.7.1


Hi Chesnay,

Thanks for the efforts. +1 for the release. It's nice to have these fixes.

Best,
Hequn

On Wed, Dec 12, 2018 at 11:09 PM Till Rohrmann wrote:
Thanks for starting this discussion Chesnay. +1 for creating the 1.7.1 release 
since it already contains very useful fixes.
Cheers,
Till

On Wed, Dec 12, 2018 at 1:11 PM vino yang wrote:
Hi Chesnay,
+1 to release Flink 1.7.1.
Best,
Vino

Chesnay Schepler wrote on Wed, Dec 12, 2018 at 8:04 PM:
Hello,
I propose releasing Flink 1.7.1 before the end of next week.
Some critical issues have been identified since 1.7.0, including a state 
migration issue when migrating from 1.5.3 (FLINK-11087) and a packaging issue in 
the presto-s3-filesystem (FLINK-11085, to be merged later today). Given the 
upcoming holidays surrounding Christmas, these fixes would otherwise be delayed 
for quite a while.
It would also contain a very neat improvement, with the Table API now being 
usable in the Scala shell (FLINK-9555).
Regards,
Chesnay

Re: [ANNOUNCE] New committer Gary Yao

2018-09-07 Thread bupt_ljy
Congratulations Gary!


Jiayi


Original Message
Sender: vino yang yanghua1...@gmail.com
Recipient: dev...@flink.apache.org
Date: Saturday, Sep 8, 2018 10:11
Subject: Re: [ANNOUNCE] New committer Gary Yao


Congratulations Gary!

Chen Qin wrote on Sat, Sep 8, 2018 at 2:07 AM:
Congrats! Chen

On Sep 7, 2018, at 10:51, Xingcan Cui wrote:
Congratulations, Gary! Xingcan

On Sep 7, 2018, at 11:20 PM, Hequn Cheng wrote:
Congratulations Gary! Hequn

On Fri, Sep 7, 2018 at 11:16 PM Matthias J. Sax wrote:
Congrats!

On 09/07/2018 08:15 AM, Timo Walther wrote:
Congratulations, Gary! Timo

On 07.09.18 at 16:46, Ufuk Celebi wrote:
Great addition to the committers. Congrats, Gary! – Ufuk

On Fri, Sep 7, 2018 at 4:45 PM, Kostas Kloudas wrote:
Congratulations Gary! Well deserved! Cheers, Kostas

On Sep 7, 2018, at 4:43 PM, Fabian Hueske wrote:
Congratulations Gary!

2018-09-07 16:29 GMT+02:00 Thomas Weise:
Congrats, Gary!

On Fri, Sep 7, 2018 at 4:17 PM Dawid Wysakowicz wrote:
Congratulations Gary! Well deserved!

On 07/09/18 16:00, zhangminglei wrote:
Congrats Gary! Cheers, Minglei

On Sep 7, 2018, at 9:59 PM, Andrey Zagrebin wrote:
Congratulations Gary!

On 7 Sep 2018, at 15:45, Stefan Richter wrote:
Congrats Gary!

On 07.09.2018 at 15:14, Till Rohrmann wrote:
Hi everybody,
On behalf of the PMC I am delighted to announce Gary Yao as a new Flink committer!
Gary started contributing to the project in June 2017. He helped with the Flip-6 
implementation, implemented many of the new REST handlers, fixed Mesos issues and 
initiated the Jepsen-based distributed test suite which uncovered several serious 
issues. Moreover, he actively helps community members on the mailing list and 
with PR reviews.
Please join me in congratulating Gary for becoming a Flink committer!
Cheers,
Till

Re: [Proposal] Utilities for reading, transforming and creating Streaming savepoints

2018-08-17 Thread bupt_ljy
Hi,
+1, I think this will be a very useful tool for Flink, especially the part about 
creating new state. In production, we're really worried about the availability of 
savepoints, because the generation logic is inside Flink and we don't have a good 
way to validate it. But with this tool, we can construct new state for our 
programs quickly even if the savepoint data is broken.
It's great, thanks!


Original Message
Sender: Jamie Grier jgr...@lyft.com
Recipient: dev...@flink.apache.org
Date: Saturday, Aug 18, 2018 02:32
Subject: Re: [Proposal] Utilities for reading, transforming and creating Streaming savepoints


This is great, Gyula! A colleague here at Lyft has also done some work around 
bootstrapping DataStream programs, and we've also talked a bit about doing this by 
running DataSet programs.

On Fri, Aug 17, 2018 at 3:28 AM, Gyula Fóra gyula.f...@gmail.com wrote:

Hi All!

I want to share with you a little project we have been working on at King (with 
some help from some dataArtisans folks). I think this would be a valuable addition 
to Flink and would solve a bunch of outstanding production use-cases and headaches 
around state bootstrapping and state analytics.

We have built a quick and dirty POC implementation on top of Flink 1.6; please 
check the README for some nice examples to get a quick idea:
https://github.com/king/bravo

*Short story*
Bravo is a convenient state reader and writer library leveraging Flink's batch 
processing capabilities. It supports processing and writing Flink streaming 
savepoints. At the moment it only supports processing RocksDB savepoints, but this 
can be extended in the future to other state backends and checkpoint types.

Our goal is to cover a few basic features:
- Converting keyed states to Flink DataSets for processing and analytics
- Reading/writing non-keyed operator states
- Bootstrapping keyed states from Flink DataSets and creating new valid savepoints
- Transforming existing savepoints by replacing/changing some states

Some example use-cases:
- Point-in-time state analytics across all operators and keys
- Bootstrapping the state of a streaming job from external resources, such as 
reading from a database/filesystem
- Validating and potentially repairing the corrupted state of a streaming job
- Changing the max parallelism of a job

Our main goal is to start working together with other Flink production users and 
make this something useful that can be part of Flink. So if you have use-cases, 
please talk to us :)

I have also started a Google doc which contains a little bit more info than the 
README and could be a starting place for discussions:
https://docs.google.com/document/d/103k6wPX20kMu5H3SOOXSg5PZIaYpwdhqBMr-ppkFL5E/edit?usp=sharing

I know there are a bunch of rough edges and bugs (and no tests), but our motto is: 
if you are not embarrassed, you released too late :)

Please let me know what you think!

Cheers,
Gyula