[jira] [Created] (FLINK-36325) Implement basic restore from checkpoint for ForStStateBackend
Feifan Wang created FLINK-36325: --- Summary: Implement basic restore from checkpoint for ForStStateBackend Key: FLINK-36325 URL: https://issues.apache.org/jira/browse/FLINK-36325 Project: Flink Issue Type: Sub-task Components: Runtime / State Backends Reporter: Feifan Wang As titled: implement basic restore from checkpoint for ForStStateBackend; rescaling will be implemented later. -- This message was sent by Atlassian Jira (v8.20.10#820010)
Re:[VOTE] FLIP-444: Native file copy support
+1 (non-binding) —— Best regards, Feifan Wang At 2024-06-25 16:58:22, "Piotr Nowojski" wrote: >Hi all, > >I would like to start a vote for the FLIP-444 [1]. The discussion thread is >here [2]. > >The vote will be open for at least 72 hours. > >Best, >Piotrek > >[1] https://cwiki.apache.org/confluence/x/rAn9EQ >[2] https://lists.apache.org/thread/lkwmyjt2bnmvgx4qpp82rldwmtd4516c
Re:[ANNOUNCE] New Apache Flink Committer - Zhongqiang Gong
Congratulations Zhongqiang ! —— Best regards, Feifan Wang At 2024-06-17 11:20:30, "Leonard Xu" wrote: >Hi everyone, >On behalf of the PMC, I'm happy to announce that Zhongqiang Gong has become a >new Flink Committer! > >Zhongqiang has been an active Flink community member since November 2021, >contributing numerous PRs to both the Flink and Flink CDC repositories. As a >core contributor to Flink CDC, he developed the Oracle and SQL Server CDC >Connectors and managed essential website and CI migrations during the donation >of Flink CDC to Apache Flink. > >Beyond his technical contributions, Zhongqiang actively participates in >discussions on the Flink dev mailing list and responds to threads on the user >and user-zh mailing lists. As an Apache StreamPark (incubating) Committer, he >promotes Flink SQL and Flink CDC technologies at meetups and within the >StreamPark community. > >Please join me in congratulating Zhongqiang Gong for becoming an Apache Flink >committer! > >Best, >Leonard (on behalf of the Flink PMC)
Re:[ANNOUNCE] New Apache Flink Committer - Hang Ruan
Congratulations Hang ! —— Best regards, Feifan Wang At 2024-06-17 11:17:13, "Leonard Xu" wrote: >Hi everyone, >On behalf of the PMC, I'm happy to let you know that Hang Ruan has become a >new Flink Committer ! > >Hang Ruan has been continuously contributing to the Flink project since August >2021. Since then, he has continuously contributed to Flink, Flink CDC, and >various Flink connector repositories, including flink-connector-kafka, >flink-connector-elasticsearch, flink-connector-aws, flink-connector-rabbitmq, >flink-connector-pulsar, and flink-connector-mongodb. Hang Ruan focuses on the >improvements related to connectors and catalogs and initiated FLIP-274. He is >most recognized as a core contributor and maintainer for the Flink CDC >project, contributing many features such as MySQL CDC newly table addition and >the Schema Evolution feature. > >Beyond his technical contributions, Hang Ruan is an active member of the Flink >community. He regularly engages in discussions on the Flink dev mailing list >and the user-zh and user mailing lists, participates in FLIP discussions, >assists with user Q&A, and consistently volunteers for release verifications. > >Please join me in congratulating Hang Ruan for becoming an Apache Flink >committer! > >Best, >Leonard (on behalf of the Flink PMC)
[jira] [Created] (FLINK-35510) Implement basic incremental checkpoint for ForStStateBackend
Feifan Wang created FLINK-35510: --- Summary: Implement basic incremental checkpoint for ForStStateBackend Key: FLINK-35510 URL: https://issues.apache.org/jira/browse/FLINK-35510 Project: Flink Issue Type: New Feature Components: Runtime / State Backends Reporter: Feifan Wang Use the low-level DB API to implement a basic incremental checkpoint for ForStStateBackend, following these steps: 1. db.disableFileDeletions() 2. db.getLiveFiles(true) 3. db.enableFileDeletions(false) -- This message was sent by Atlassian Jira (v8.20.10#820010)
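The three steps correspond to RocksDB's file-pinning API for consistent backups: disable deletions so compaction cannot remove files mid-capture, list the live files, then re-enable deletions. A minimal sketch of that sequence, using a hypothetical `CheckpointableDb` interface as a stand-in for the real `rocksdbjni` dependency:

```java
import java.util.List;

public class ForStIncrementalCheckpoint {

    /** Hypothetical stand-in for the small slice of the RocksDB API the steps need. */
    public interface CheckpointableDb {
        void disableFileDeletions();
        List<String> getLiveFiles(boolean flushMemtable);
        void enableFileDeletions(boolean force);
    }

    /** Capture a consistent list of live files: pin files, list them, then unpin. */
    public static List<String> captureLiveFiles(CheckpointableDb db) {
        db.disableFileDeletions();           // 1. keep compaction from deleting SST files
        try {
            return db.getLiveFiles(true);    // 2. flush the memtable and list all live files
        } finally {
            db.enableFileDeletions(false);   // 3. re-enable deletions (false = don't force)
        }
    }
}
```

The try/finally guarantees deletions are re-enabled even if listing fails, which matters because leaving deletions disabled would let obsolete files accumulate.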
[jira] [Created] (FLINK-35434) Support pass exception in StateExecutor to runtime
Feifan Wang created FLINK-35434: --- Summary: Support pass exception in StateExecutor to runtime Key: FLINK-35434 URL: https://issues.apache.org/jira/browse/FLINK-35434 Project: Flink Issue Type: Sub-task Components: Runtime / State Backends Reporter: Feifan Wang An exception may be thrown when _StateExecutor_ executes a state request, such as an IOException. We should pass the exception to the runtime and fail the job in this situation. _InternalStateFuture#completeExceptionally()_ will be added as per the [discussion here|https://github.com/apache/flink/pull/24739#discussion_r1590633134]. _ForStWriteBatchOperation_ and _ForStGeneralMultiGetOperation_ will then call this method when an exception occurs. -- This message was sent by Atlassian Jira (v8.20.10#820010)
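The propagation path can be sketched with a plain `CompletableFuture`: the executor completes the request's future exceptionally instead of swallowing the error, and the runtime side fails the job when it observes the exceptional completion. Class and method names below are illustrative stand-ins, not Flink's actual `InternalStateFuture` API:

```java
import java.io.IOException;
import java.util.concurrent.CompletableFuture;

public class StateExecutorSketch {

    /** Hypothetical async state read that may hit an I/O error. */
    public static CompletableFuture<byte[]> executeGet(boolean simulateIoFailure) {
        CompletableFuture<byte[]> resultFuture = new CompletableFuture<>();
        try {
            if (simulateIoFailure) {
                throw new IOException("simulated DFS read failure");
            }
            resultFuture.complete(new byte[] {42});
        } catch (IOException e) {
            // Instead of swallowing the error inside the executor,
            // hand it to the runtime through the future.
            resultFuture.completeExceptionally(e);
        }
        return resultFuture;
    }

    /** Runtime side: observing an exceptional completion should fail the job. */
    public static String observe(CompletableFuture<byte[]> future) {
        return future
                .handle((value, error) -> error == null ? "OK" : "FAIL_JOB: " + error.getMessage())
                .join();
    }
}
```

The key point is that the exception travels with the request's future rather than being logged and dropped inside the state executor thread, so the task can react to it.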
Re:[ANNOUNCE] New Apache Flink PMC Member - Lincoln Lee
Congratulations Lincoln ! —— Best regards, Feifan Wang At 2024-04-12 15:59:00, "Jark Wu" wrote: >Hi everyone, > >On behalf of the PMC, I'm very happy to announce that Lincoln Lee has >joined the Flink PMC! > >Lincoln has been an active member of the Apache Flink community for >many years. He mainly works on Flink SQL component and has driven >/pushed many FLIPs around SQL, including FLIP-282/373/415/435 in >the recent versions. He has a great technical vision of Flink SQL and >participated in plenty of discussions in the dev mailing list. Besides >that, >he is community-minded, such as being the release manager of 1.19, >verifying releases, managing release syncs, writing the release >announcement etc. > >Congratulations and welcome Lincoln! > >Best, >Jark (on behalf of the Flink PMC)
Re:[ANNOUNCE] New Apache Flink PMC Member - Jing Ge
Congratulations, Jing! —— Best regards, Feifan Wang At 2024-04-12 16:02:01, "Jark Wu" wrote: >Hi everyone, > >On behalf of the PMC, I'm very happy to announce that Jing Ge has >joined the Flink PMC! > >Jing has been contributing to Apache Flink for a long time. He continuously >works on SQL, connectors, Source, and Sink APIs, test, and document >modules while contributing lots of code and insightful discussions. He is >one of the maintainers of Flink CI infra. He is also willing to help a lot >in the >community work, such as being the release manager for both 1.18 and 1.19, >verifying releases, and answering questions on the mailing list. Besides >that, >he is continuously helping with the expansion of the Flink community and >has >given several talks about Flink at many conferences, such as Flink Forward >2022 and 2023. > >Congratulations and welcome Jing! > >Best, >Jark (on behalf of the Flink PMC)
Re:[ANNOUNCE] New Apache Flink Committer - Zakelly Lan
Congratulations, Zakelly!—— Best regards, Feifan Wang At 2024-04-15 10:50:06, "Yuan Mei" wrote: >Hi everyone, > >On behalf of the PMC, I'm happy to let you know that Zakelly Lan has become >a new Flink Committer! > >Zakelly has been continuously contributing to the Flink project since 2020, >with a focus area on Checkpointing, State as well as frocksdb (the default >on-disk state db). > >He leads several FLIPs to improve checkpoints and state APIs, including >File Merging for Checkpoints and configuration/API reorganizations. He is >also one of the main contributors to the recent efforts of "disaggregated >state management for Flink 2.0" and drives the entire discussion in the >mailing thread, demonstrating outstanding technical depth and breadth of >knowledge. > >Beyond his technical contributions, Zakelly is passionate about helping the >community in numerous ways. He spent quite some time setting up the Flink >Speed Center and rebuilding the benchmark pipeline after the original one >was out of lease. He helps build frocksdb and tests for the upcoming >frocksdb release (bump rocksdb from 6.20.3->8.10). > >Please join me in congratulating Zakelly for becoming an Apache Flink >committer! > >Best, >Yuan (on behalf of the Flink PMC)
Re:[ANNOUNCE] New Apache Flink PMC Member - Lincoln Lee
Congratulations, Lincoln! —— Best regards, Feifan Wang At 2024-04-12 15:59:00, "Jark Wu" wrote: >Hi everyone, > >On behalf of the PMC, I'm very happy to announce that Lincoln Lee has >joined the Flink PMC! > >Lincoln has been an active member of the Apache Flink community for >many years. He mainly works on Flink SQL component and has driven >/pushed many FLIPs around SQL, including FLIP-282/373/415/435 in >the recent versions. He has a great technical vision of Flink SQL and >participated in plenty of discussions in the dev mailing list. Besides >that, >he is community-minded, such as being the release manager of 1.19, >verifying releases, managing release syncs, writing the release >announcement etc. > >Congratulations and welcome Lincoln! > >Best, >Jark (on behalf of the Flink PMC)
Re:[VOTE] FLIP-441: Show the JobType and remove Execution Mode on Flink WebUI
+1 (non-binding) -- —— Best regards, Feifan Wang At 2024-04-10 12:36:00, "Rui Fan" <1996fan...@gmail.com> wrote: >Hi devs, > >Thank you to everyone for the feedback on FLIP-441: Show >the JobType and remove Execution Mode on Flink WebUI[1] >which has been discussed in this thread [2]. > >I would like to start a vote for it. The vote will be open for at least 72 >hours unless there is an objection or not enough votes. > >[1] https://cwiki.apache.org/confluence/x/agrPEQ >[2] https://lists.apache.org/thread/0s52w17w24x7m2zo6ogl18t1fy412vcd > >Best, >Rui
Re:Re: [VOTE] FLIP-428: Fault Tolerance/Rescale Integration for Disaggregated State
+1 (non-binding) —— Best regards, Feifan Wang At 2024-03-28 18:43:36, "Yuan Mei" wrote: >+1 (binding) > >Best, >Yuan > >On Wed, Mar 27, 2024 at 7:31 PM Jinzhong Li >wrote: > >> Hi devs, >> >> >> I'd like to start a vote on the FLIP-428: Fault Tolerance/Rescale >> Integration for Disaggregated State [1]. The discussion thread is here [2]. >> >> >> The vote will be open for at least 72 hours unless there is an objection or >> insufficient votes. >> >> [1] https://cwiki.apache.org/confluence/x/UYp3EQ >> >> [2] https://lists.apache.org/thread/vr8f91p715ct4lop6b3nr0fh4z5p312b >> >> Best, >> >> Jinzhong >>
Re:Re: [VOTE] FLIP-426: Grouping Remote State Access
+1 (non-binding) —— Best regards, Feifan Wang At 2024-03-28 18:43:12, "Yuan Mei" wrote: >+1 (binding) > >Best, >Yuan > >On Wed, Mar 27, 2024 at 6:56 PM Jinzhong Li >wrote: > >> Hi devs, >> >> I'd like to start a vote on the FLIP-426: Grouping Remote State Access [1]. >> The discussion thread is here [2]. >> >> The vote will be open for at least 72 hours unless there is an objection or >> insufficient votes. >> >> >> [1] https://cwiki.apache.org/confluence/x/TYp3EQ >> >> [2] https://lists.apache.org/thread/bt931focfl9971cwq194trmf3pkdsxrf >> >> >> Best, >> >> Jinzhong >>
Re:Re: [VOTE] FLIP-427: Disaggregated state Store
+1 (non-binding) —— Best regards, Feifan Wang At 2024-03-28 19:01:01, "Yuan Mei" wrote: >+1 (binding) > >Best >Yuan > > > > >On Wed, Mar 27, 2024 at 6:37 PM Hangxiang Yu wrote: > >> Hi devs, >> >> Thanks all for your valuable feedback about FLIP-427: Disaggregated state >> Store [1]. >> I'd like to start a vote on it. The discussion thread is here [2]. >> >> The vote will be open for at least 72 hours unless there is an objection or >> insufficient votes. >> >> [1] https://cwiki.apache.org/confluence/x/T4p3EQ >> [2] https://lists.apache.org/thread/vktfzqvb7t4rltg7fdlsyd9sfdmrc4ft >> >> >> Best, >> Hangxiang >>
Re:Re: [VOTE] FLIP-425: Asynchronous Execution Model
+1 (non-binding) —— Best regards, Feifan Wang At 2024-03-28 20:20:39, "Piotr Nowojski" wrote: >+1 (binding) > >Piotrek > >czw., 28 mar 2024 o 11:44 Yuan Mei napisał(a): > >> +1 (binding) >> >> Best, >> Yuan >> >> On Thu, Mar 28, 2024 at 4:33 PM Xuannan Su wrote: >> >> > +1 (non-binding) >> > >> > Best regards, >> > Xuannan >> > >> > On Wed, Mar 27, 2024 at 6:28 PM Yanfei Lei wrote: >> > > >> > > Hi everyone, >> > > >> > > Thanks for all the feedback about the FLIP-425: Asynchronous Execution >> > > Model [1]. The discussion thread is here [2]. >> > > >> > > The vote will be open for at least 72 hours unless there is an >> > > objection or insufficient votes. >> > > >> > > [1] https://cwiki.apache.org/confluence/x/S4p3EQ >> > > [2] https://lists.apache.org/thread/wxn1j848fnfkqsnrs947wh1wmj8n8z0h >> > > >> > > Best regards, >> > > Yanfei >> > >>
Re:Re: [VOTE] FLIP-424: Asynchronous State APIs
+1 (non-binding) —— Best regards, Feifan Wang At 2024-03-28 20:21:50, "Jing Ge" wrote: >+1 (binding) > >Thanks! > >Best regards, >Jing > >On Wed, Mar 27, 2024 at 11:23 AM Zakelly Lan wrote: > >> Hi devs, >> >> I'd like to start a vote on the FLIP-424: Asynchronous State APIs [1]. The >> discussion thread is here [2]. >> >> The vote will be open for at least 72 hours unless there is an objection or >> insufficient votes. >> >> [1] https://cwiki.apache.org/confluence/x/SYp3EQ >> [2] https://lists.apache.org/thread/nmd9qd0k8l94ygcfgllxms49wmtz1864 >> >> >> Best, >> Zakelly >>
Re:Re: [VOTE] FLIP-423: Disaggregated State Storage and Management (Umbrella FLIP)
+1 (non-binding) —— Best regards, Feifan Wang At 2024-03-28 20:20:21, "Piotr Nowojski" wrote: >+1 (binding) > >Piotrek > >czw., 28 mar 2024 o 11:43 Yuan Mei napisał(a): > >> Hi devs, >> >> I'd like to start a vote on the FLIP-423: Disaggregated State Storage and >> Management (Umbrella FLIP) [1]. The discussion thread is here [2]. >> >> The vote will be open for at least 72 hours unless there is an objection or >> insufficient votes. >> >> [1] https://cwiki.apache.org/confluence/x/R4p3EQ >> [2] https://lists.apache.org/thread/ct8smn6g9y0b8730z7rp9zfpnwmj8vf0 >> >> >> Best, >> Yuan >>
Re:Re: [ANNOUNCE] Apache Paimon is graduated to Top Level Project
Congratulations!—— Best regards, Feifan Wang At 2024-03-28 20:02:43, "Yanfei Lei" wrote: >Congratulations! > >Best, >Yanfei > >Zhanghao Chen 于2024年3月28日周四 19:59写道: >> >> Congratulations! >> >> Best, >> Zhanghao Chen >> >> From: Yu Li >> Sent: Thursday, March 28, 2024 15:55 >> To: d...@paimon.apache.org >> Cc: dev ; user >> Subject: Re: [ANNOUNCE] Apache Paimon is graduated to Top Level Project >> >> CC the Flink user and dev mailing list. >> >> Paimon originated within the Flink community, initially known as Flink >> Table Store, and all our incubating mentors are members of the Flink >> Project Management Committee. I am confident that the bonds of >> enduring friendship and close collaboration will continue to unite the >> two communities. >> >> And congratulations all! >> >> Best Regards, >> Yu >> >> On Wed, 27 Mar 2024 at 20:35, Guojun Li wrote: >> > >> > Congratulations! >> > >> > Best, >> > Guojun >> > >> > On Wed, Mar 27, 2024 at 5:24 PM wulin wrote: >> > >> > > Congratulations~ >> > > >> > > > 2024年3月27日 15:54,王刚 写道: >> > > > >> > > > Congratulations~ >> > > > >> > > >> 2024年3月26日 10:25,Jingsong Li 写道: >> > > >> >> > > >> Hi Paimon community, >> > > >> >> > > >> I’m glad to announce that the ASF board has approved a resolution to >> > > >> graduate Paimon into a full Top Level Project. Thanks to everyone for >> > > >> your help to get to this point. >> > > >> >> > > >> I just created an issue to track the things we need to modify [2], >> > > >> please comment on it if you feel that something is missing. You can >> > > >> refer to apache documentation [1] too. >> > > >> >> > > >> And, we already completed the GitHub repo migration [3], please update >> > > >> your local git repo to track the new repo [4]. >> > > >> >> > > >> You can run the following command to complete the remote repo tracking >> > > >> migration. 
>> > > >> >> > > >> git remote set-url origin https://github.com/apache/paimon.git >> > > >> >> > > >> If you have a different name, please change the 'origin' to your >> > > >> remote >> > > name. >> > > >> >> > > >> Please join me in celebrating! >> > > >> >> > > >> [1] >> > > https://incubator.apache.org/guides/transferring.html#life_after_graduation >> > > >> [2] https://github.com/apache/paimon/issues/3091 >> > > >> [3] https://issues.apache.org/jira/browse/INFRA-25630 >> > > >> [4] https://github.com/apache/paimon >> > > >> >> > > >> Best, >> > > >> Jingsong Lee >> > > >> > >
Re: [DISCUSS] FLIP-427: Disaggregated State Store
Thanks for your reply, Hangxiang. I totally agree with you about the jni part. Hi Yun Tang, I just noticed that FLIP-427 mentions “The life cycle of working dir is managed as before local strategy.” IIUC, the working dir will be deleted after TaskManager exit. And I think that's enough for current stage, WDYT ? —— Best regards, Feifan Wang At 2024-03-28 12:18:56, "Hangxiang Yu" wrote: >Hi, Feifan. > >Thanks for your reply. > >What if we only use jni to access DFS that needs to reuse Flink FileSystem? >> And all local disk access through native api. This idea is based on the >> understanding that jni overhead is not worth mentioning compared to DFS >> access latency. It might make more sense to consider avoiding jni overhead >> for faster local disks. Since local disk as secondary is already under >> consideration [1], maybe we can discuss in that FLIP whether to use native >> api to access local disk? >> >This is a good suggestion. It's reasonable to use native api to access >local disk cache since it requires lower latency compared to remote access. >I also believe that the jni overhead is relatively negligible when weighed >against the latency of remote I/O as mentioned in the FLIP. >So I think we could just go on proposal 2 and keep proposal 1 as a >potential future optimization, which could work better when there is a >higher performance requirement or some native libraries of filesystems have >significantly higher performance and resource usage compared to their java >libs. > > >On Thu, Mar 28, 2024 at 11:39 AM Feifan Wang wrote: > >> Thanks for this valuable proposal Hangxiang ! >> >> >> > If we need to introduce a JNI call during each filesystem call, that >> would be N times JNI cost compared with the current RocksDB state-backend's >> JNI cost. >> What if we only use jni to access DFS that needs to reuse Flink >> FileSystem? And all local disk access through native api. 
This idea is >> based on the understanding that jni overhead is not worth mentioning >> compared to DFS access latency. It might make more sense to consider >> avoiding jni overhead for faster local disks. Since local disk as secondary >> is already under consideration [1], maybe we can discuss in that FLIP >> whether to use native api to access local disk? >> >> >> >I'd suggest keeping `state.backend.forSt.working-dir` as it is for now. >> >Different disaggregated state storages may have their own semantics about >> >this configuration, e.g. life cycle, supported file systems or storages. >> I agree with considering moving this configuration up to the engine level >> until there are other disaggreated backends. >> >> >> [1] https://cwiki.apache.org/confluence/x/U4p3EQ >> >> —— >> >> Best regards, >> >> Feifan Wang >> >> >> >> >> At 2024-03-28 09:55:48, "Hangxiang Yu" wrote: >> >Hi, Yun. >> >Thanks for the reply. >> > >> >The JNI cost you considered is right. As replied to Yue, I agreed to leave >> >space and consider proposal 1 as an optimization in the future, which is >> >also updated in the FLIP. >> > >> >The other question is that the configuration of >> >> `state.backend.forSt.working-dir` looks too coupled with the ForSt >> >> state-backend, how would it be if we introduce another disaggregated >> state >> >> storage? Thus, I think `state.backend.disaggregated.working-dir` might >> be a >> >> better configuration name. >> > >> >I'd suggest keeping `state.backend.forSt.working-dir` as it is for now. >> >Different disaggregated state storages may have their own semantics about >> >this configuration, e.g. life cycle, supported file systems or storages. >> >Maybe it's more suitable to consider it together when we introduce other >> >disaggregated state storages in the future. >> > >> >On Thu, Mar 28, 2024 at 12:02 AM Yun Tang wrote: >> > >> >> Hi Hangxiang, >> >> >> >> The design looks good, and I also support leaving space for proposal 1. 
>> >> >> >> As you know, loading index/filter/data blocks for querying across levels >> >> would introduce high IO access within the LSM tree for old data. If we >> need >> >> to introduce a JNI call during each filesystem call, that would be N >> times >> >> JNI cost compared with the current RocksDB state-backend's JNI cost. >> >> >> >> The other ques
Re: [DISCUSS] FLIP-428: Fault Tolerance/Rescale Integration for Disaggregated State
And I think the cleanup of working dir should be discussion in FLIP-427[1] ( this mail list [2]) ? [1] https://cwiki.apache.org/confluence/x/T4p3EQ [2] https://lists.apache.org/thread/vktfzqvb7t4rltg7fdlsyd9sfdmrc4ft —— Best regards, Feifan Wang At 2024-03-28 11:56:22, "Feifan Wang" wrote: >Hi Jinzhong : > > >> I suggest that we could postpone this topic for now and consider it >> comprehensively combined with the TM ownership file management in the future >> FLIP. > > >Sorry I still think we should consider the cleanup of the working dir in this >FLIP, although we may come up with a better solution in a subsequent flip, I >think it is important to maintain the integrity of the current changes. >Otherwise we may suffer from wasted DFS space for some time. >Perhaps we only need a simple cleanup strategy at this stage, such as >proactive cleanup when TM exits. While this may fail in the case of a TM >crash, it already alleviates the problem. > > > > >—— > >Best regards, > >Feifan Wang > > > > >At 2024-03-28 11:15:11, "Jinzhong Li" wrote: >>Hi Yun, >> >>Thanks for your reply. >> >>> 1. Why must we have another 'subTask-checkpoint-sub-dir' >>> under the shared directory? if we don't consider making >>> TM ownership in this FLIP, this design seems unnecessary. >> >> Good catch! We will not change the directory layout of shared directory in >>this FLIP. I have already removed this part from this FLIP. I think we >>could revisit this topic in a future FLIP about TM ownership. >> >>> 2. This FLIP forgets to mention the cleanup of the remote >>> working directory in case of the taskmanager crushes, >>> even though this is an open problem, we can still leave >>> some space for future optimization. >> >>Considering that we have plans to merge TM working dir and checkpoint dir >>into one directory, I suggest that we could postpone this topic for now and >>consider it comprehensively combined with the TM ownership file management >>in the future FLIP. 
>> >>Best, >>Jinzhong >> >> >> >>On Wed, Mar 27, 2024 at 11:49 PM Yun Tang wrote: >> >>> Hi Jinzhong, >>> >>> The overall design looks good. >>> >>> I have two minor questions: >>> >>> 1. Why must we have another 'subTask-checkpoint-sub-dir' under the shared >>> directory? if we don't consider making TM ownership in this FLIP, this >>> design seems unnecessary. >>> 2. This FLIP forgets to mention the cleanup of the remote working >>> directory in case of the taskmanager crushes, even though this is an open >>> problem, we can still leave some space for future optimization. >>> >>> Best, >>> Yun Tang >>> >>> >>> From: Jinzhong Li >>> Sent: Monday, March 25, 2024 10:41 >>> To: dev@flink.apache.org >>> Subject: Re: [DISCUSS] FLIP-428: Fault Tolerance/Rescale Integration for >>> Disaggregated State >>> >>> Hi Yue, >>> >>> Thanks for your comments. >>> >>> The CURRENT is a special file that points to the latest manifest log >>> file. As Zakelly explained above, we could record the latest manifest >>> filename during sync phase, and write the filename into CURRENT snapshot >>> file during async phase. >>> >>> Best, >>> Jinzhong >>> >>> On Fri, Mar 22, 2024 at 11:16 PM Zakelly Lan >>> wrote: >>> >>> > Hi Yue, >>> > >>> > Thanks for bringing this up! >>> > >>> > The CURRENT FILE is the special one, which should be snapshot during the >>> > sync phase (temporary load into memory). Thus we can solve this. >>> > >>> > >>> > Best, >>> > Zakelly >>> > >>> > On Fri, Mar 22, 2024 at 4:55 PM yue ma wrote: >>> > >>> > > Hi jinzhong, >>> > > Thanks for you reply. I still have some doubts about the first >>> question. >>> > Is >>> > > there such a case >>> > > When you made a snapshot during the synchronization phase, you recorded >>> > the >>> > > current and manifest 8, but before asynchronous phase, the manifest >>> > reached >>> > > the size threshold and then the CURRENT FILE pointed to the new >>> manifest >>> > 9, >>>
Re:Re: [DISCUSS] FLIP-428: Fault Tolerance/Rescale Integration for Disaggregated State
Hi Jinzhong : > I suggest that we could postpone this topic for now and consider it > comprehensively combined with the TM ownership file management in the future > FLIP. Sorry I still think we should consider the cleanup of the working dir in this FLIP, although we may come up with a better solution in a subsequent flip, I think it is important to maintain the integrity of the current changes. Otherwise we may suffer from wasted DFS space for some time. Perhaps we only need a simple cleanup strategy at this stage, such as proactive cleanup when TM exits. While this may fail in the case of a TM crash, it already alleviates the problem. —— Best regards, Feifan Wang At 2024-03-28 11:15:11, "Jinzhong Li" wrote: >Hi Yun, > >Thanks for your reply. > >> 1. Why must we have another 'subTask-checkpoint-sub-dir' >> under the shared directory? if we don't consider making >> TM ownership in this FLIP, this design seems unnecessary. > > Good catch! We will not change the directory layout of shared directory in >this FLIP. I have already removed this part from this FLIP. I think we >could revisit this topic in a future FLIP about TM ownership. > >> 2. This FLIP forgets to mention the cleanup of the remote >> working directory in case of the taskmanager crushes, >> even though this is an open problem, we can still leave >> some space for future optimization. > >Considering that we have plans to merge TM working dir and checkpoint dir >into one directory, I suggest that we could postpone this topic for now and >consider it comprehensively combined with the TM ownership file management >in the future FLIP. > >Best, >Jinzhong > > > >On Wed, Mar 27, 2024 at 11:49 PM Yun Tang wrote: > >> Hi Jinzhong, >> >> The overall design looks good. >> >> I have two minor questions: >> >> 1. Why must we have another 'subTask-checkpoint-sub-dir' under the shared >> directory? if we don't consider making TM ownership in this FLIP, this >> design seems unnecessary. >> 2. 
This FLIP forgets to mention the cleanup of the remote working >> directory in case of the taskmanager crushes, even though this is an open >> problem, we can still leave some space for future optimization. >> >> Best, >> Yun Tang >> >> >> From: Jinzhong Li >> Sent: Monday, March 25, 2024 10:41 >> To: dev@flink.apache.org >> Subject: Re: [DISCUSS] FLIP-428: Fault Tolerance/Rescale Integration for >> Disaggregated State >> >> Hi Yue, >> >> Thanks for your comments. >> >> The CURRENT is a special file that points to the latest manifest log >> file. As Zakelly explained above, we could record the latest manifest >> filename during sync phase, and write the filename into CURRENT snapshot >> file during async phase. >> >> Best, >> Jinzhong >> >> On Fri, Mar 22, 2024 at 11:16 PM Zakelly Lan >> wrote: >> >> > Hi Yue, >> > >> > Thanks for bringing this up! >> > >> > The CURRENT FILE is the special one, which should be snapshot during the >> > sync phase (temporary load into memory). Thus we can solve this. >> > >> > >> > Best, >> > Zakelly >> > >> > On Fri, Mar 22, 2024 at 4:55 PM yue ma wrote: >> > >> > > Hi jinzhong, >> > > Thanks for you reply. I still have some doubts about the first >> question. >> > Is >> > > there such a case >> > > When you made a snapshot during the synchronization phase, you recorded >> > the >> > > current and manifest 8, but before asynchronous phase, the manifest >> > reached >> > > the size threshold and then the CURRENT FILE pointed to the new >> manifest >> > 9, >> > > and then uploaded the incorrect CURRENT file ? >> > > >> > > Jinzhong Li 于2024年3月20日周三 20:13写道: >> > > >> > > > Hi Yue, >> > > > >> > > > Thanks for your feedback! >> > > > >> > > > > 1. If we choose Option-3 for ForSt , how would we handle Manifest >> > File >> > > > > ? Should we take a snapshot of the Manifest during the >> > synchronization >> > > > phase? 
>> > > > >> > > > IIUC, the GetLiveFiles() API in Option-3 can also catch the fileInfo >> of >> > > > Manifest files, and this api also return the manifest file size, >> which >> > > > means this ap
Re:Re: [DISCUSS] FLIP-427: Disaggregated State Store
Thanks for this valuable proposal Hangxiang ! > If we need to introduce a JNI call during each filesystem call, that would be > N times JNI cost compared with the current RocksDB state-backend's JNI cost. What if we only use jni to access DFS that needs to reuse Flink FileSystem? And all local disk access through native api. This idea is based on the understanding that jni overhead is not worth mentioning compared to DFS access latency. It might make more sense to consider avoiding jni overhead for faster local disks. Since local disk as secondary is already under consideration [1], maybe we can discuss in that FLIP whether to use native api to access local disk? >I'd suggest keeping `state.backend.forSt.working-dir` as it is for now. >Different disaggregated state storages may have their own semantics about >this configuration, e.g. life cycle, supported file systems or storages. I agree with considering moving this configuration up to the engine level until there are other disaggreated backends. [1] https://cwiki.apache.org/confluence/x/U4p3EQ —————— Best regards, Feifan Wang At 2024-03-28 09:55:48, "Hangxiang Yu" wrote: >Hi, Yun. >Thanks for the reply. > >The JNI cost you considered is right. As replied to Yue, I agreed to leave >space and consider proposal 1 as an optimization in the future, which is >also updated in the FLIP. > >The other question is that the configuration of >> `state.backend.forSt.working-dir` looks too coupled with the ForSt >> state-backend, how would it be if we introduce another disaggregated state >> storage? Thus, I think `state.backend.disaggregated.working-dir` might be a >> better configuration name. > >I'd suggest keeping `state.backend.forSt.working-dir` as it is for now. >Different disaggregated state storages may have their own semantics about >this configuration, e.g. life cycle, supported file systems or storages. 
>Maybe it's more suitable to consider it together when we introduce other >disaggregated state storages in the future. > >On Thu, Mar 28, 2024 at 12:02 AM Yun Tang wrote: > >> Hi Hangxiang, >> >> The design looks good, and I also support leaving space for proposal 1. >> >> As you know, loading index/filter/data blocks for querying across levels >> would introduce high IO access within the LSM tree for old data. If we need >> to introduce a JNI call during each filesystem call, that would be N times >> JNI cost compared with the current RocksDB state-backend's JNI cost. >> >> The other question is that the configuration of >> `state.backend.forSt.working-dir` looks too coupled with the ForSt >> state-backend, how would it be if we introduce another disaggregated state >> storage? Thus, I think `state.backend.disaggregated.working-dir` might be a >> better configuration name. >> >> >> Best >> Yun Tang >> >> >> From: Hangxiang Yu >> Sent: Wednesday, March 20, 2024 11:32 >> To: dev@flink.apache.org >> Subject: Re: [DISCUSS] FLIP-427: Disaggregated State Store >> >> Hi, Yue. >> Thanks for the reply. >> >> If we use proposal1, we can easily reuse these optimizations .It is even >> > possible to discuss and review the solution together in the Rocksdb >> > community. >> >> We also saw these useful optimizations which could be applied to ForSt in >> the future. >> But IIUC, it's not binding to proposal 1, right? We could also >> implement interfaces about temperature and secondary cache to reuse them, >> or organize a more complex HybridEnv based on proposal 2. >> >> My point is whether we should retain the potential of proposal 1 in the >> > design. >> > >> This is a good suggestion. We choose proposal 2 firstly due to its >> maintainability and scalability, especially because it could leverage all >> filesystems flink supported conveniently. >> Given the indelible advantage in performance, I think we could also >> consider proposal 1 as an optimization in the future. 
>> For the interface on the DB side, we could also expose more different Envs >> in the future. >> >> >> On Tue, Mar 19, 2024 at 9:14 PM yue ma wrote: >> >> > Hi Hangxiang, >> > >> > Thanks for bringing this discussion. >> > I have a few questions about the Proposal you mentioned in the FLIP. >> > >> > The current conclusion is to use proposal 2, which is okay for me. My >> point >> > is whether we should retain the potential of proposal 1 in the design. >> > There are the following reasons: >> > 1.
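The "JNI overhead is negligible compared to DFS access latency" argument from the thread above can be illustrated with a back-of-envelope sketch. The nanosecond figures below are rough, assumed orders of magnitude for illustration only, not measurements from this discussion:

```java
public class OverheadEstimate {
    public static void main(String[] args) {
        // Assumed, illustrative numbers: a JNI boundary crossing costs
        // on the order of tens of nanoseconds, while one DFS read round
        // trip typically costs milliseconds and a local SSD read tens
        // of microseconds.
        double jniNanos = 50;           // assumed per-call JNI overhead
        double dfsNanos = 5_000_000;    // assumed 5 ms DFS access latency
        double localNanos = 20_000;     // assumed 20 us local SSD read

        // Relative to DFS, the JNI cost is vanishingly small; relative
        // to a fast local disk it is no longer negligible, which is why
        // avoiding JNI matters mainly for the local-disk path.
        System.out.printf("JNI vs DFS:   %.4f%%%n", 100 * jniNanos / dfsNanos);
        System.out.printf("JNI vs local: %.4f%%%n", 100 * jniNanos / localNanos);
    }
}
```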
Re:Re: [ANNOUNCE] Donation Flink CDC into Apache Flink has Completed
Congratulations ! 在 2024-03-22 12:04:39,"Hangxiang Yu" 写道: >Congratulations! >Thanks for the efforts. > >On Fri, Mar 22, 2024 at 10:00 AM Yanfei Lei wrote: > >> Congratulations! >> >> Best regards, >> Yanfei >> >> Xuannan Su 于2024年3月22日周五 09:21写道: >> > >> > Congratulations! >> > >> > Best regards, >> > Xuannan >> > >> > On Fri, Mar 22, 2024 at 9:17 AM Charles Zhang >> wrote: >> > > >> > > Congratulations! >> > > >> > > Best wishes, >> > > Charles Zhang >> > > from Apache InLong >> > > >> > > >> > > Jeyhun Karimov 于2024年3月22日周五 04:16写道: >> > > >> > > > Great news! Congratulations! >> > > > >> > > > Regards, >> > > > Jeyhun >> > > > >> > > > On Thu, Mar 21, 2024 at 2:00 PM Yuxin Tan >> wrote: >> > > > >> > > > > Congratulations! Thanks for the efforts. >> > > > > >> > > > > >> > > > > Best, >> > > > > Yuxin >> > > > > >> > > > > >> > > > > Samrat Deb 于2024年3月21日周四 20:28写道: >> > > > > >> > > > > > Congratulations ! >> > > > > > >> > > > > > Bests >> > > > > > Samrat >> > > > > > >> > > > > > On Thu, 21 Mar 2024 at 5:52 PM, Ahmed Hamdy < >> hamdy10...@gmail.com> >> > > > > wrote: >> > > > > > >> > > > > > > Congratulations, great work and great news. >> > > > > > > Best Regards >> > > > > > > Ahmed Hamdy >> > > > > > > >> > > > > > > >> > > > > > > On Thu, 21 Mar 2024 at 11:41, Benchao Li > > >> > > > wrote: >> > > > > > > >> > > > > > > > Congratulations, and thanks for the great work! >> > > > > > > > >> > > > > > > > Yuan Mei 于2024年3月21日周四 18:31写道: >> > > > > > > > > >> > > > > > > > > Thanks for driving these efforts! >> > > > > > > > > >> > > > > > > > > Congratulations >> > > > > > > > > >> > > > > > > > > Best >> > > > > > > > > Yuan >> > > > > > > > > >> > > > > > > > > On Thu, Mar 21, 2024 at 4:35 PM Yu Li >> wrote: >> > > > > > > > > >> > > > > > > > > > Congratulations and look forward to its further >> development! 
>> > > > > > > > > > >> > > > > > > > > > Best Regards, >> > > > > > > > > > Yu >> > > > > > > > > > >> > > > > > > > > > On Thu, 21 Mar 2024 at 15:54, ConradJam < >> jam.gz...@gmail.com> >> > > > > > wrote: >> > > > > > > > > > > >> > > > > > > > > > > Congrattulations! >> > > > > > > > > > > >> > > > > > > > > > > Leonard Xu 于2024年3月20日周三 21:36写道: >> > > > > > > > > > > >> > > > > > > > > > > > Hi devs and users, >> > > > > > > > > > > > >> > > > > > > > > > > > We are thrilled to announce that the donation of >> Flink CDC >> > > > > as a >> > > > > > > > > > > > sub-project of Apache Flink has completed. We invite >> you to >> > > > > > > explore >> > > > > > > > > > the new >> > > > > > > > > > > > resources available: >> > > > > > > > > > > > >> > > > > > > > > > > > - GitHub Repository: >> https://github.com/apache/flink-cdc >> > > > > > > > > > > > - Flink CDC Documentation: >> > > > > > > > > > > > >> https://nightlies.apache.org/flink/flink-cdc-docs-stable >> > > > > > > > > > > > >> > > > > > > > > > > > After Flink community accepted this donation[1], we >> have >> > > > > > > completed >> > > > > > > > > > > > software copyright signing, code repo migration, code >> > > > > cleanup, >> > > > > > > > website >> > > > > > > > > > > > migration, CI migration and github issues migration >> etc. >> > > > > > > > > > > > Here I am particularly grateful to Hang Ruan, >> Zhongqaing >> > > > > Gong, >> > > > > > > > > > Qingsheng >> > > > > > > > > > > > Ren, Jiabao Sun, LvYanquan, loserwang1024 and other >> > > > > > contributors >> > > > > > > > for >> > > > > > > > > > their >> > > > > > > > > > > > contributions and help during this process! >> > > > > > > > > > > > >> > > > > > > > > > > > >> > > > > > > > > > > > For all previous contributors: The contribution >> process has >> > > > > > > > slightly >> > > > > > > > > > > > changed to align with the main Flink project. 
To >> report >> > > > bugs >> > > > > or >> > > > > > > > > > suggest new >> > > > > > > > > > > > features, please open tickets >> > > > > > > > > > > > Apache Jira (https://issues.apache.org/jira). Note >> that >> > > > we >> > > > > > will >> > > > > > > > no >> > > > > > > > > > > > longer accept GitHub issues for these purposes. >> > > > > > > > > > > > >> > > > > > > > > > > > >> > > > > > > > > > > > Welcome to explore the new repository and >> documentation. >> > > > Your >> > > > > > > > feedback >> > > > > > > > > > and >> > > > > > > > > > > > contributions are invaluable as we continue to >> improve >> > > > Flink >> > > > > > CDC. >> > > > > > > > > > > > >> > > > > > > > > > > > Thanks everyone for your support and happy exploring >> Flink >> > > > > CDC! >> > > > > > > > > > > > >> > > > > > > > > > > > Best, >> > > > > > > > > > > > Leonard >> > > > > > > > > > > > [1] >> > > > > > > > >> https://lists.apache.org/thread/cw29fhsp99243yfo95xrkw82s5s418ob >> > > > > > > > > > > > >> > > > > > > > > > > > >> > > > > > > > > > > >> > > > > > > > > > > -- >> > > > > > > > > > > Best >> > > > > > > > > > > >> > > > > > > > > > > ConradJam >> > > > > > > > > > >> > > > > > > > >> > > > > > >
[jira] [Created] (FLINK-33734) Merge unaligned checkpoint
Feifan Wang created FLINK-33734: --- Summary: Merge unaligned checkpoint Key: FLINK-33734 URL: https://issues.apache.org/jira/browse/FLINK-33734 Project: Flink Issue Type: Improvement Reporter: Feifan Wang -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (FLINK-32908) Fix wrong materialization id setting when switching from non-log-based checkpoint to log-based checkpoint
Feifan Wang created FLINK-32908: --- Summary: Fix wrong materialization id setting when switching from non-log-based checkpoint to log-based checkpoint Key: FLINK-32908 URL: https://issues.apache.org/jira/browse/FLINK-32908 Project: Flink Issue Type: Bug Components: Runtime / State Backends Reporter: Feifan Wang The initial materialization ID should be set to the checkpoint ID when switching from a non-log-based checkpoint to a log-based checkpoint. Currently the initial ID is set to 0, which causes the incremental snapshot of the inner backend to go wrong. PTAL [~Yanfei Lei] , [~roman] . -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (FLINK-32901) Should set initial
Feifan Wang created FLINK-32901: --- Summary: Should set initial Key: FLINK-32901 URL: https://issues.apache.org/jira/browse/FLINK-32901 Project: Flink Issue Type: Bug Reporter: Feifan Wang -- This message was sent by Atlassian Jira (v8.20.10#820010)
Re: [ANNOUNCE] New Apache Flink PMC Member - Matthias Pohl
Congrats Matthias! —— Name: Feifan Wang Email: zoltar9...@163.com Replied Message | From | Matthias Pohl | | Date | 08/7/2023 16:16 | | To | | | Subject | Re: [ANNOUNCE] New Apache Flink PMC Member - Matthias Pohl | Thanks everyone. :) On Mon, Aug 7, 2023 at 3:18 AM Andriy Redko wrote: Congrats Matthias, well deserved!! DC> Congrats Matthias! DC> Very well deserved, thankyou for your continuous, consistent contributions. DC> Welcome. DC> Thanks, DC> Danny DC> On Fri, Aug 4, 2023 at 9:30 AM Feng Jin wrote: Congratulations, Matthias! Best regards Feng On Fri, Aug 4, 2023 at 4:29 PM weijie guo wrote: Congratulations, Matthias! Best regards, Weijie Wencong Liu 于2023年8月4日周五 15:50写道: Congratulations, Matthias! Best, Wencong Liu At 2023-08-04 11:18:00, "Xintong Song" wrote: Hi everyone, On behalf of the PMC, I'm very happy to announce that Matthias Pohl has joined the Flink PMC! Matthias has been consistently contributing to the project since Sep 2020, and became a committer in Dec 2021. He mainly works in Flink's distributed coordination and high availability areas. He has worked on many FLIPs including FLIP195/270/285. He helped a lot with the release management, being one of the Flink 1.17 release managers and also very active in Flink 1.18 / 2.0 efforts. He also contributed a lot to improving the build stability. Please join me in congratulating Matthias! Best, Xintong (on behalf of the Apache Flink PMC)
Re: [ANNOUNCE] New Apache Flink Committer - Weihua Hu
Congratulations, Weihua! —— Name: Feifan Wang Email: zoltar9...@163.com Replied Message | From | Ming Li | | Date | 08/7/2023 16:52 | | To | | | Subject | Re: [ANNOUNCE] New Apache Flink Committer - Weihua Hu | Congratulations, Weihua! Best, Ming Li Matthias Pohl 于2023年8月7日周一 16:18写道: Congratulations, Weihua. On Mon, Aug 7, 2023 at 3:50 AM Yuepeng Pan wrote: Congratulations, Weihua! Best, Yuepeng Pan 在 2023-08-07 09:17:41,"yh z" 写道: Congratulations, Weihua! Best, Yunhong Zheng (Swuferhong) Runkang He 于2023年8月5日周六 21:34写道: Congratulations, Weihua! Best, Runkang He Kelu Tao 于2023年8月4日周五 18:21写道: Congratulations! On 2023/08/04 08:35:49 Danny Cranmer wrote: Congrats and welcome to the team, Weihua! Thanks, Danny On Fri, Aug 4, 2023 at 9:30 AM Feng Jin wrote: Congratulations Weihua! Best regards, Feng On Fri, Aug 4, 2023 at 4:28 PM weijie guo < guoweijieres...@gmail.com wrote: Congratulations Weihua! Best regards, Weijie Lijie Wang 于2023年8月4日周五 15:28写道: Congratulations, Weihua! Best, Lijie yuxia 于2023年8月4日周五 15:14写道: Congratulations, Weihua! Best regards, Yuxia - 原始邮件 - 发件人: "Yun Tang" 收件人: "dev" 发送时间: 星期五, 2023年 8 月 04日 下午 3:05:30 主题: Re: [ANNOUNCE] New Apache Flink Committer - Weihua Hu Congratulations, Weihua! Best Yun Tang From: Jark Wu Sent: Friday, August 4, 2023 15:00 To: dev@flink.apache.org Subject: Re: [ANNOUNCE] New Apache Flink Committer - Weihua Hu Congratulations, Weihua! Best, Jark On Fri, 4 Aug 2023 at 14:48, Yuxin Tan < tanyuxinw...@gmail.com wrote: Congratulations Weihua! Best, Yuxin Junrui Lee 于2023年8月4日周五 14:28写道: Congrats, Weihua! Best, Junrui Geng Biao 于2023年8月4日周五 14:25写道: Congrats, Weihua! Best, Biao Geng 发送自 Outlook for iOS<https://aka.ms/o0ukef> 发件人: 周仁祥 发送时间: Friday, August 4, 2023 2:23:42 PM 收件人: dev@flink.apache.org 抄送: Weihua Hu 主题: Re: [ANNOUNCE] New Apache Flink Committer - Weihua Hu Congratulations, Weihua~ 2023年8月4日 14:21,Sergey Nuyanzin < snuyan...@gmail.com> 写道: Congratulations, Weihua! 
On Fri, Aug 4, 2023 at 8:03 AM Chen Zhanghao < zhanghao.c...@outlook.com> wrote: Congratulations, Weihua! Best, Zhanghao Chen 发件人: Xintong Song 发送时间: 2023年8月4日 11:18 收件人: dev 抄送: Weihua Hu 主题: [ANNOUNCE] New Apache Flink Committer - Weihua Hu Hi everyone, On behalf of the PMC, I'm very happy to announce Weihua Hu as a new Flink Committer! Weihua has been consistently contributing to the project since May 2022. He mainly works in Flink's distributed coordination areas. He is the main contributor of FLIP-298 and many other improvements in large-scale job scheduling and improvements. He is also quite active in mailing lists, participating discussions and answering user questions. Please join me in congratulating Weihua! Best, Xintong (on behalf of the Apache Flink PMC) -- Best regards, Sergey
Re: [ANNOUNCE] New Apache Flink Committer - Hangxiang Yu
Congratulations Hangxiang! :) —— Name: Feifan Wang Email: zoltar9...@163.com Replied Message | From | Mang Zhang | | Date | 08/7/2023 18:56 | | To | | | Subject | Re:Re: [ANNOUNCE] New Apache Flink Committer - Hangxiang Yu | Congratulations-- Best regards, Mang Zhang 在 2023-08-07 18:18:08,"Yuxin Tan" 写道: Congrats, Hangxiang! Best, Yuxin weijie guo 于2023年8月7日周一 17:59写道: Congrats, Hangxiang! Best regards, Weijie Biao Geng 于2023年8月7日周一 17:04写道: Congrats, Hangxiang! Best, Biao Geng 发送自 Outlook for iOS<https://aka.ms/o0ukef> 发件人: Qingsheng Ren 发送时间: Monday, August 7, 2023 4:23:11 PM 收件人: dev@flink.apache.org 主题: Re: [ANNOUNCE] New Apache Flink Committer - Hangxiang Yu Congratulations and welcome aboard, Hangxiang! Best, Qingsheng On Mon, Aug 7, 2023 at 4:19 PM Matthias Pohl wrote: Congratulations, Hangxiang! :) On Mon, Aug 7, 2023 at 10:01 AM Junrui Lee wrote: Congratulations, Hangxiang! Best, Junrui Yun Tang 于2023年8月7日周一 15:19写道: Congratulations, Hangxiang! Best Yun Tang From: Danny Cranmer Sent: Monday, August 7, 2023 15:11 To: dev Subject: Re: [ANNOUNCE] New Apache Flink Committer - Hangxiang Yu Congrats Hangxiang! Welcome to the team. Danny. On Mon, 7 Aug 2023, 08:04 Rui Fan, <1996fan...@gmail.com> wrote: Congratulations Hangxiang! Best, Rui On Mon, Aug 7, 2023 at 2:58 PM Yuan Mei wrote: On behalf of the PMC, I'm happy to announce Hangxiang Yu as a new Flink Committer. Hangxiang has been active in the Flink community for more than 1.5 years and has played an important role in developing and maintaining State and Checkpoint related features/components, including Generic Incremental Checkpoints (take great efforts to make the feature prod-ready). Hangxiang is also the main driver of the FLIP-263: Resolving schema compatibility. Hangxiang is passionate about the Flink community. Besides the technical contribution above, he is also actively promoting Flink: talks about Generic Incremental Checkpoints in Flink Forward and Meet-up. 
Hangxiang also spent a good amount of time supporting users, participating in Jira/mailing list discussions, and reviewing code. Please join me in congratulating Hangxiang for becoming a Flink Committer! Thanks, Yuan Mei (on behalf of the Flink PMC)
Re: [ANNOUNCE] New Apache Flink Committer - Yanfei Lei
Congratulations Yanfei! :) —— Name: Feifan Wang Email: zoltar9...@163.com Replied Message | From | Matt Wang | | Date | 08/7/2023 19:40 | | To | dev@flink.apache.org | | Subject | Re: [ANNOUNCE] New Apache Flink Committer - Yanfei Lei | Congratulations Yanfei! -- Best, Matt Wang Replied Message | From | Mang Zhang | | Date | 08/7/2023 18:56 | | To | | | Subject | Re:Re: [ANNOUNCE] New Apache Flink Committer - Yanfei Lei | Congratulations-- Best regards, Mang Zhang 在 2023-08-07 18:17:58,"Yuxin Tan" 写道: Congrats, Yanfei! Best, Yuxin weijie guo 于2023年8月7日周一 17:59写道: Congrats, Yanfei! Best regards, Weijie Biao Geng 于2023年8月7日周一 17:03写道: Congrats, Yanfei! Best, Biao Geng 发送自 Outlook for iOS<https://aka.ms/o0ukef> 发件人: Qingsheng Ren 发送时间: Monday, August 7, 2023 4:23:52 PM 收件人: dev@flink.apache.org 主题: Re: [ANNOUNCE] New Apache Flink Committer - Yanfei Lei Congratulations and welcome, Yanfei! Best, Qingsheng On Mon, Aug 7, 2023 at 4:19 PM Matthias Pohl wrote: Congratulations, Yanfei! :) On Mon, Aug 7, 2023 at 10:00 AM Junrui Lee wrote: Congratulations Yanfei! Best, Junrui Yun Tang 于2023年8月7日周一 15:19写道: Congratulations, Yanfei! Best Yun Tang From: Danny Cranmer Sent: Monday, August 7, 2023 15:10 To: dev Subject: Re: [ANNOUNCE] New Apache Flink Committer - Yanfei Lei Congrats Yanfei! Welcome to the team. Danny On Mon, 7 Aug 2023, 08:03 Rui Fan, <1996fan...@gmail.com> wrote: Congratulations Yanfei! Best, Rui On Mon, Aug 7, 2023 at 2:56 PM Yuan Mei wrote: On behalf of the PMC, I'm happy to announce Yanfei Lei as a new Flink Committer. Yanfei has been active in the Flink community for almost two years and has played an important role in developing and maintaining State and Checkpoint related features/components, including RocksDB Rescaling Performance Improvement and Generic Incremental Checkpoints. Yanfei also helps improve community infrastructure in many ways, including migrating the Flink Daily performance benchmark to the Apache Flink slack channel. 
She is the maintainer of the benchmark and has improved its detection stability significantly. She is also one of the major maintainers of the FrocksDB Repo and released FRocksDB 6.20.3 (part of Flink 1.17 release). Yanfei is a very active community member, supporting users and participating in tons of discussions on the mailing lists. Please join me in congratulating Yanfei for becoming a Flink Committer! Thanks, Yuan Mei (on behalf of the Flink PMC)
[jira] [Created] (FLINK-32769) PeriodicMaterializationManager pass descriptionFormat with invalid placeholder to MailboxExecutor#execute()
Feifan Wang created FLINK-32769: --- Summary: PeriodicMaterializationManager pass descriptionFormat with invalid placeholder to MailboxExecutor#execute() Key: FLINK-32769 URL: https://issues.apache.org/jira/browse/FLINK-32769 Project: Flink Issue Type: Bug Components: Runtime / State Backends Reporter: Feifan Wang The descriptionFormat in _MailboxExecutor#execute( ThrowingRunnable command, String descriptionFormat, Object... descriptionArgs)_ is passed to _String.format()_, which does not substitute SLF4J-style placeholders like "{}". But PeriodicMaterializationManager passes a descriptionFormat containing the invalid placeholder '{}'. -- This message was sent by Atlassian Jira (v8.20.10#820010)
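For reference, a minimal JDK-only demonstration of the mismatch: `String.format()` only recognizes `%`-style conversions, so an SLF4J-style `{}` passes through literally and its argument is silently ignored:

```java
public class PlaceholderDemo {
    public static void main(String[] args) {
        // SLF4J-style "{}" is not a String.format conversion; the extra
        // argument is silently ignored and "{}" stays literal.
        System.out.println(String.format("Materialize state, id = {}", 42));
        // prints: Materialize state, id = {}

        // String.format requires "%"-style conversions instead:
        System.out.println(String.format("Materialize state, id = %d", 42));
        // prints: Materialize state, id = 42
    }
}
```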
[jira] [Created] (FLINK-32761) Use md5 sum of PhysicalStateHandleID as SharedStateRegistryKey ChangelogStateHandleStreamImpl
Feifan Wang created FLINK-32761: --- Summary: Use md5 sum of PhysicalStateHandleID as SharedStateRegistryKey ChangelogStateHandleStreamImpl Key: FLINK-32761 URL: https://issues.apache.org/jira/browse/FLINK-32761 Project: Flink Issue Type: Improvement Components: Runtime / State Backends Reporter: Feifan Wang _ChangelogStateHandleStreamImpl#getKey()_ uses _System.identityHashCode(stateHandle)_ as the _SharedStateRegistryKey_ when the stateHandle is neither a _FileStateHandle_ nor a {_}ByteStreamStateHandle{_}. That can easily lead to collisions, although from the current code path it only affects the test code. In FLINK-29913 we used the md5 sum of the PhysicalStateHandleID as the SharedStateRegistryKey in IncrementalRemoteKeyedStateHandle; we can reuse this method in ChangelogStateHandleStreamImpl. -- This message was sent by Atlassian Jira (v8.20.10#820010)
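A small sketch of why a content-based digest makes a better registry key than `System.identityHashCode`: identity hash depends on the particular object instance, whereas an md5 of the handle's path depends only on the path string, so two handles for the same logical state always map to the same key. Plain JDK APIs; the path below is made up for illustration:

```java
import java.math.BigInteger;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;

public class StableKeyDemo {
    // md5 of a path string, rendered as a 32-char hex key.
    static String md5(String s) {
        try {
            byte[] digest = MessageDigest.getInstance("MD5")
                    .digest(s.getBytes(StandardCharsets.UTF_8));
            return String.format("%032x", new BigInteger(1, digest));
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
    }

    public static void main(String[] args) {
        String path = "/checkpoints/shared/abc-001.sst"; // made-up path
        String copy = new String(path); // distinct object, equal content
        // identityHashCode is per-object; the digest is per-content,
        // so equal handles always produce the same registry key.
        System.out.println(md5(path).equals(md5(copy))); // true
    }
}
```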
Re: [ANNOUNCE] New Apache Flink Committer - Yong Fang
Congratulations, Yong Fang! Best, Feifan Wang | | Feifan Wang | | zoltar9...@163.com | Replied Message | From | Yuxin Tan | | Date | 07/26/2023 10:25 | | To | | | Subject | Re: Re: [ANNOUNCE] New Apache Flink Committer - Yong Fang | Congratulations, Yong Fang! Best, Yuxin Yanfei Lei 于2023年7月26日周三 10:18写道: Congratulations! Best regards, Yanfei weijie guo 于2023年7月26日周三 10:10写道: Congrats, Yong Fang! Best regards, Weijie Danny Cranmer 于2023年7月26日周三 03:34写道: Congrats and welcome! Danny. On Tue, 25 Jul 2023, 16:48 Matthias Pohl, wrote: Congratulations :) On Tue, Jul 25, 2023 at 5:13 PM Jing Ge wrote: Congrats, Yong Fang! Best regards, Jing On Tue, Jul 25, 2023 at 7:35 PM Yu Li wrote: Congrats, Yong! Best Regards, Yu On Tue, 25 Jul 2023 at 18:03, Sergey Nuyanzin < snuyan...@gmail.com> wrote: Congratulations, Yong Fang! On Tue, Jul 25, 2023 at 7:53 AM ConradJam 于2023年7月25日周二 12:08写道: Congratulations, Yong Fang! -- Best regards, Mang Zhang 在 2023-07-25 10:30:24,"Jark Wu" 写道: Congratulations, Yong Fang! Best, Jark On Mon, 24 Jul 2023 at 22:11, Wencong Liu < liuwencle...@163.com wrote: Congratulations! Best, Wencong Liu 在 2023-07-24 11:03:30,"Paul Lam"
Re: [DISCUSS] FLIP 333 - Redesign Apache Flink website
+1 , the new design looks more attractive and is well organized | | Feifan Wang | | zoltar9...@163.com | Replied Message | From | Leonard Xu | | Date | 07/11/2023 16:34 | | To | dev | | Subject | Re: [DISCUSS] FLIP 333 - Redesign Apache Flink website | +1 for the redesigning, the new website looks cool. Best, Leonard On Jul 11, 2023, at 7:55 AM, Mohan, Deepthi wrote: Hi, I’m opening this thread to discuss a proposal to redesign the Apache Flink website: https://flink.apache.org. The approach and a few initial mockups are included in FLIP 333 - Redesign Apache Flink website.<https://cwiki.apache.org/confluence/display/FLINK/FLIP-333%3A+Redesign+Apache+Flink+website> The goal is to modernize the website design to help existing and new users easily understand Flink’s value proposition and make Flink attractive to new users. As suggested in a previous thread, there are no proposed changes to Flink documentation. I look forward to your feedback and the discussion. Thanks, Deepthi
[jira] [Created] (FLINK-32141) SharedStateRegistry print too much info log
Feifan Wang created FLINK-32141: --- Summary: SharedStateRegistry print too much info log Key: FLINK-32141 URL: https://issues.apache.org/jira/browse/FLINK-32141 Project: Flink Issue Type: Bug Components: Runtime / Checkpointing Affects Versions: 1.17.0 Reporter: Feifan Wang Fix For: 1.17.1 Attachments: image-2023-05-21-00-26-20-026.png FLINK-29095 added some logging to SharedStateRegistry for troubleshooting. Among other things, an info log is emitted when newHandle is equal to the already registered one: [https://github.com/apache/flink/blob/release-1.17.0/flink-runtime/src/main/java/org/apache/flink/runtime/state/SharedStateRegistryImpl.java#L117] !image-2023-05-21-00-26-20-026.png|width=775,height=126! But this case should not be considered a possible bug, because FsStateChangelogStorage will directly use the FileStateHandle of the previous checkpoint instead of a PlaceholderStreamStateHandle. In our tests, the JobManager printed so much of this log that useful information was overwhelmed. So I suggest changing this log level to trace, WDYT [~Yanfei Lei], [~klion26] ? -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (FLINK-32135) FRocksDB fix log file create failed caused by file name too long
Feifan Wang created FLINK-32135: --- Summary: FRocksDB fix log file create failed caused by file name too long Key: FLINK-32135 URL: https://issues.apache.org/jira/browse/FLINK-32135 Project: Flink Issue Type: Bug Components: Runtime / State Backends Reporter: Feifan Wang RocksDB uses the instance path as the log file name when a log path is specified, but if the instance path is so long that the resulting file name exceeds the filesystem's limit, log file creation will fail. We disabled log relocation when the RocksDB instance path is too long in FLINK-31743, but that was just a hotfix. This ticket proposes to solve the problem in FRocksDB. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (FLINK-32130) previous checkpoint will be broken by the subsequent incremental checkpoint
Feifan Wang created FLINK-32130: --- Summary: previous checkpoint will be broken by the subsequent incremental checkpoint Key: FLINK-32130 URL: https://issues.apache.org/jira/browse/FLINK-32130 Project: Flink Issue Type: Bug Reporter: Feifan Wang Currently, _SharedStateRegistryImpl_ discards the old entry when new state is registered under the same key:
{code:java}
// Old entry is not in a confirmed checkpoint yet, and the new one differs.
// This might result from (omitted KG range here for simplicity):
// 1. Flink recovers from a failure using a checkpoint 1
// 2. State Backend is initialized to UID xyz and a set of SST: { 01.sst }
// 3. JM triggers checkpoint 2
// 4. TM sends handle: "xyz-002.sst"; JM registers it under "xyz-002.sst"
// 5. TM crashes; everything is repeated from (2)
// 6. TM recovers from CP 1 again: backend UID "xyz", SST { 01.sst }
// 7. JM triggers checkpoint 3
// 8. TM sends NEW state "xyz-002.sst"
// 9. JM discards it as duplicate
// 10. checkpoint completes, but a wrong SST file is used
// So we use a new entry and discard the old one:
LOG.info(
    "Duplicated registration under key {} of a new state: {}. "
        + "This might happen during the task failover if state backend creates different states with the same key before and after the failure. "
        + "Discarding the OLD state and keeping the NEW one which is included into a completed checkpoint",
    registrationKey,
    newHandle);
scheduledStateDeletion = entry.stateHandle;
entry.stateHandle = newHandle; {code}
But if _execution.checkpointing.max-concurrent-checkpoints_ > 1, the following case will fail (take _RocksDBStateBackend_ as an example): # cp1 triggers: 1.sst is uploaded to file-1 and registered as <1.sst,file-1>; cp1 references file-1 # cp1 is not yet complete when cp2 triggers: 1.sst is uploaded to file-2 and registered as <1.sst,file-2>; SharedStateRegistry discards file-1 # cp1 completes and cp2 fails, but cp1 is broken (file-1 has been deleted) I think we should allow registering multiple state objects under the same key, WDYT [~pnowojski], [~roman] ? -- This message was sent by Atlassian Jira (v8.20.10#820010)
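The failure mode in steps 1-3 can be reduced to a minimal, self-contained sketch (illustrative names, not Flink's actual SharedStateRegistryImpl): a registry that keeps only one handle per key schedules file-1 for deletion as soon as cp2 re-registers the same key, even though the still-pending cp1 references it.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class SharedStateRaceDemo {
    // One handle per key: re-registration evicts the previous handle.
    static Map<String, String> registry = new HashMap<>();
    static List<String> deleted = new ArrayList<>();

    static void register(String key, String handle) {
        String old = registry.put(key, handle);
        if (old != null && !old.equals(handle)) {
            deleted.add(old); // old handle is scheduled for deletion
        }
    }

    public static void main(String[] args) {
        register("1.sst", "file-1"); // cp1 in flight, references file-1
        register("1.sst", "file-2"); // cp2 in flight, re-uploads same SST
        // cp1 later completes and cp2 fails, but file-1 is already gone:
        System.out.println(deleted); // [file-1]
    }
}
```

Allowing multiple handles per key (e.g. a list per key, released as each referencing checkpoint is subsumed) would avoid deleting state that a pending checkpoint still needs.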
[jira] [Created] (FLINK-31900) Fix some typos in java doc, comments and assertion messages
Feifan Wang created FLINK-31900: --- Summary: Fix some typos in java doc, comments and assertion messages Key: FLINK-31900 URL: https://issues.apache.org/jira/browse/FLINK-31900 Project: Flink Issue Type: Bug Components: Documentation Reporter: Feifan Wang As the title says. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (FLINK-31414) exceptions in the alignment timer are ignored
Feifan Wang created FLINK-31414: --- Summary: exceptions in the alignment timer are ignored Key: FLINK-31414 URL: https://issues.apache.org/jira/browse/FLINK-31414 Project: Flink Issue Type: Bug Components: Runtime / Checkpointing Reporter: Feifan Wang The alignment timer task in alternating aligned checkpoints runs as a future task in the mailbox thread, which causes exceptions thrown from it ([SingleCheckpointBarrierHandler#registerAlignmentTimer()|https://github.com/apache/flink/blob/65ab8e820a3714d2134dfb4c9772a10c998bd45a/flink-streaming-java/src/main/java/org/apache/flink/streaming/runtime/io/checkpointing/SingleCheckpointBarrierHandler.java#L327]) to be ignored. See: [BarrierAlignmentUtil#createRegisterTimerCallback()|https://github.com/apache/flink/blob/65ab8e820a3714d2134dfb4c9772a10c998bd45a/flink-streaming-java/src/main/java/org/apache/flink/streaming/runtime/io/checkpointing/BarrierAlignmentUtil.java#L50] -- This message was sent by Atlassian Jira (v8.20.10#820010)
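The underlying Java behavior can be demonstrated without Flink: an exception thrown inside a task submitted to an executor is captured by the returned Future and silently dropped unless someone inspects the future, analogous to the alignment timer exception disappearing when the submitted task's result is never checked. Plain JDK sketch, not Flink code:

```java
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class SwallowedExceptionDemo {
    public static void main(String[] args) throws Exception {
        ExecutorService pool = Executors.newSingleThreadExecutor();

        // The task throws, but nothing is printed and no thread dies:
        // the exception is stored inside the Future.
        Runnable failing = () -> {
            throw new RuntimeException("alignment timeout handling failed");
        };
        Future<?> future = pool.submit(failing);

        // The exception only surfaces if somebody calls get():
        try {
            future.get();
        } catch (ExecutionException e) {
            System.out.println("caught: " + e.getCause().getMessage());
        }
        pool.shutdown();
    }
}
```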
[jira] [Created] (FLINK-31139) not upload empty state changelog file
Feifan Wang created FLINK-31139: --- Summary: not upload empty state changelog file Key: FLINK-31139 URL: https://issues.apache.org/jira/browse/FLINK-31139 Project: Flink Issue Type: Bug Components: Runtime / State Backends Reporter: Feifan Wang Fix For: 1.16.2 Attachments: image-2023-02-20-19-51-34-397.png h1. Problem *_BatchingStateChangeUploadScheduler_* uploads many empty changelog files (file size == 1, containing only the compression flag). !image-2023-02-20-19-51-34-397.png|width=1062,height=188! These files are not referenced by any checkpoint, are never cleaned up, and accumulate as the job runs. Taking our big job as an example, 2292 such files were generated within 7 hours. At that rate it only takes about 4 months for the number of files in the changelog directory to exceed a million. h1. Problem causes This problem is caused by *_BatchingStateChangeUploadScheduler#drainAndSave_* not checking whether the drained task collection is empty. The data in the scheduled queue may already have been uploaded by the time _*BatchingStateChangeUploadScheduler#drainAndSave*_ is executed. So we should check whether the task collection is empty in *_BatchingStateChangeUploadScheduler#drainAndSave_* . WDYT [~roman] , [~Yanfei Lei] ? -- This message was sent by Atlassian Jira (v8.20.10#820010)
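A hypothetical sketch of the proposed guard (names are illustrative, not the actual BatchingStateChangeUploadScheduler API): drain the queue first, and only start an upload, which is what creates a file, when something was actually drained.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;

public class DrainAndSaveDemo {
    static Queue<String> scheduled = new ConcurrentLinkedQueue<>();
    static List<String> uploadedFiles = new ArrayList<>();

    // Drain pending change sets; skip the upload entirely when the
    // drained collection is empty (the proposed check), so no empty
    // changelog file is ever written.
    static void drainAndSave() {
        List<String> tasks = new ArrayList<>();
        String task;
        while ((task = scheduled.poll()) != null) {
            tasks.add(task);
        }
        if (tasks.isEmpty()) {
            return; // nothing to upload -> do not create a file
        }
        uploadedFiles.add("changelog-" + uploadedFiles.size());
    }

    public static void main(String[] args) {
        drainAndSave(); // queue already drained elsewhere -> no file
        System.out.println(uploadedFiles.isEmpty()); // true
        scheduled.add("change-set-1");
        drainAndSave(); // real work -> one file
        System.out.println(uploadedFiles);
    }
}
```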
[jira] [Created] (FLINK-30792) clean up not uploaded state changes after materialization complete
Feifan Wang created FLINK-30792: --- Summary: clean up not uploaded state changes after materialization complete Key: FLINK-30792 URL: https://issues.apache.org/jira/browse/FLINK-30792 Project: Flink Issue Type: Bug Components: Runtime / State Backends Affects Versions: 1.16.0 Reporter: Feifan Wang We should clean up state changes that were not yet uploaded when materialization completed; otherwise (as is currently the case): # subsequent checkpoints contain wrong state changes from before the completed materialization # a FileNotFoundException may occur when recovering from such a problematic checkpoint, because the state change files from before the completed materialization may have been deleted when the checkpoint was subsumed. Since state changes from before the completed materialization in FsStateChangelogWriter#notUploaded will not be used by any subsequent checkpoint, I suggest cleaning them up while handling the materialization result. What do you think about this? [~ym] , [~roman] -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (FLINK-30561) ChangelogStreamHandleReaderWithCache cause FileNotFoundException
Feifan Wang created FLINK-30561: --- Summary: ChangelogStreamHandleReaderWithCache cause FileNotFoundException Key: FLINK-30561 URL: https://issues.apache.org/jira/browse/FLINK-30561 Project: Flink Issue Type: Bug Components: Runtime / State Backends Affects Versions: 1.16.0 Reporter: Feifan Wang When a job with state changelog enabled continues to restart, the following exceptions may occur : {code:java} java.lang.RuntimeException: java.io.FileNotFoundException: /data1/hadoop/yarn/nm-local-dir/usercache/hadoop-rt/appcache/application_1671689962742_192/dstl-cache-file/dstl6215344559415829831.tmp (No such file or directory) at org.apache.flink.util.ExceptionUtils.rethrow(ExceptionUtils.java:321) at org.apache.flink.runtime.state.changelog.StateChangelogHandleStreamHandleReader$1.advance(StateChangelogHandleStreamHandleReader.java:87) at org.apache.flink.runtime.state.changelog.StateChangelogHandleStreamHandleReader$1.hasNext(StateChangelogHandleStreamHandleReader.java:69) at org.apache.flink.state.changelog.restore.ChangelogBackendRestoreOperation.readBackendHandle(ChangelogBackendRestoreOperation.java:107) at org.apache.flink.state.changelog.restore.ChangelogBackendRestoreOperation.restore(ChangelogBackendRestoreOperation.java:78) at org.apache.flink.state.changelog.ChangelogStateBackend.restore(ChangelogStateBackend.java:94) at org.apache.flink.state.changelog.AbstractChangelogStateBackend.createKeyedStateBackend(AbstractChangelogStateBackend.java:136) at org.apache.flink.streaming.api.operators.StreamTaskStateInitializerImpl.lambda$keyedStatedBackend$1(StreamTaskStateInitializerImpl.java:336) at org.apache.flink.streaming.api.operators.BackendRestorerProcedure.attemptCreateAndRestore(BackendRestorerProcedure.java:168) at org.apache.flink.streaming.api.operators.BackendRestorerProcedure.createAndRestore(BackendRestorerProcedure.java:135) at 
org.apache.flink.streaming.api.operators.StreamTaskStateInitializerImpl.keyedStatedBackend(StreamTaskStateInitializerImpl.java:353) at org.apache.flink.streaming.api.operators.StreamTaskStateInitializerImpl.streamOperatorStateContext(StreamTaskStateInitializerImpl.java:165) at org.apache.flink.streaming.api.operators.AbstractStreamOperator.initializeState(AbstractStreamOperator.java:265) at org.apache.flink.streaming.runtime.tasks.RegularOperatorChain.initializeStateAndOpenOperators(RegularOperatorChain.java:106) at org.apache.flink.streaming.runtime.tasks.StreamTask.restoreGates(StreamTask.java:726) at org.apache.flink.streaming.runtime.tasks.StreamTaskActionExecutor$1.call(StreamTaskActionExecutor.java:55) at org.apache.flink.streaming.runtime.tasks.StreamTask.restoreInternal(StreamTask.java:702) at org.apache.flink.streaming.runtime.tasks.StreamTask.restore(StreamTask.java:669) at org.apache.flink.runtime.taskmanager.Task.runWithSystemExitMonitoring(Task.java:935) at org.apache.flink.runtime.taskmanager.Task.restoreAndInvoke(Task.java:904) at org.apache.flink.runtime.taskmanager.Task.doRun(Task.java:728) at org.apache.flink.runtime.taskmanager.Task.run(Task.java:550) at java.lang.Thread.run(Thread.java:748) Caused by: java.io.FileNotFoundException: /data1/hadoop/yarn/nm-local-dir/usercache/hadoop-rt/appcache/application_1671689962742_192/dstl-cache-file/dstl6215344559415829831.tmp (No such file or directory) at java.io.FileInputStream.open0(Native Method) at java.io.FileInputStream.open(FileInputStream.java:195) at java.io.FileInputStream.<init>(FileInputStream.java:138) at org.apache.flink.changelog.fs.ChangelogStreamHandleReaderWithCache.openAndSeek(ChangelogStreamHandleReaderWithCache.java:158) at org.apache.flink.changelog.fs.ChangelogStreamHandleReaderWithCache.openAndSeek(ChangelogStreamHandleReaderWithCache.java:95) at org.apache.flink.changelog.fs.StateChangeIteratorImpl.read(StateChangeIteratorImpl.java:42) at 
org.apache.flink.runtime.state.changelog.StateChangelogHandleStreamHandleReader$1.advance(StateChangelogHandleStreamHandleReader.java:85) ... 21 more {code} *Problem causes:* # *_ChangelogStreamHandleReaderWithCache_* uses _RefCountedFile_ to manage the local cache file. The reference count is incremented when an input stream is opened from the cache file and decremented when the input stream is closed, so each input stream must be closed exactly once. # _*StateChangelogHandleStreamHandleReader#getChanges()*_ may cause the input stream to be closed twice. This happens when changeIterator.read(tuple2.f0, tuple2.f1) throws an exception (for example, when the task is canceled for other reasons during the restore process); the current state change iterator is then closed twice. {code:java} private void advance() { while (!current.h
[jira] [Created] (FLINK-29822) Fix wrong description in comments of StreamExecutionEnvironment#setMaxParallelism()
Feifan Wang created FLINK-29822: --- Summary: Fix wrong description in comments of StreamExecutionEnvironment#setMaxParallelism() Key: FLINK-29822 URL: https://issues.apache.org/jira/browse/FLINK-29822 Project: Flink Issue Type: Bug Reporter: Feifan Wang The upper limit (inclusive) of max parallelism is Short.MAX_VALUE + 1, not Short.MAX_VALUE. -- This message was sent by Atlassian Jira (v8.20.10#820010)
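For reference, Short.MAX_VALUE + 1 = 32768 = 2^15. A minimal sketch of the inclusive bounds check the comment should describe (assumed shape for illustration, not Flink's exact code):

```java
public class MaxParallelismBound {
    // Documented limits: lower bound 1, upper bound 2^15 inclusive.
    static final int UPPER_BOUND = Short.MAX_VALUE + 1; // 32768, i.e. 1 << 15

    public static boolean isValidMaxParallelism(int maxParallelism) {
        return maxParallelism >= 1 && maxParallelism <= UPPER_BOUND;
    }

    public static void main(String[] args) {
        System.out.println(isValidMaxParallelism(Short.MAX_VALUE + 1)); // 32768 is allowed
        System.out.println(isValidMaxParallelism(Short.MAX_VALUE + 2)); // 32769 is not
    }
}
```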
[jira] [Created] (FLINK-29526) Java doc mistake in SequenceNumberRange#contains()
Feifan Wang created FLINK-29526: --- Summary: Java doc mistake in SequenceNumberRange#contains() Key: FLINK-29526 URL: https://issues.apache.org/jira/browse/FLINK-29526 Project: Flink Issue Type: Bug Components: Runtime / State Backends Reporter: Feifan Wang Attachments: image-2022-10-06-10-50-16-927.png !image-2022-10-06-10-50-16-927.png|width=554,height=106! -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (FLINK-29244) Add metric lastMaterializationDuration to ChangelogMaterializationMetricGroup
Feifan Wang created FLINK-29244: --- Summary: Add metric lastMaterializationDuration to ChangelogMaterializationMetricGroup Key: FLINK-29244 URL: https://issues.apache.org/jira/browse/FLINK-29244 Project: Flink Issue Type: Improvement Components: Runtime / State Backends Reporter: Feifan Wang Materialization duration can help us evaluate the efficiency of materialization and its impact on the job. What do you think, [~roman]? -- This message was sent by Atlassian Jira (v8.20.10#820010)
Re:[DISCUSS] Introduce multi delete API to Flink's FileSystem class
Thanks a lot for the proposal @Yun Tang ! It sounds great and I can't find any reason not to make this improvement. —— Name: Feifan Wang Email: zoltar9...@163.com Replied Message | From | Yun Tang | | Date | 06/30/2022 16:56 | | To | dev@flink.apache.org | | Subject | [DISCUSS] Introduce multi delete API to Flink's FileSystem class | Hi guys, As more and more teams move to cloud-based environments, cloud object storage has become the de facto technical standard for big data ecosystems. From our experience, the performance of writing/deleting objects in object storage can vary per call; the FLIP for the changelog state backend ran experiments writing the same data multiple times [1], and it proved that p999 latency could be 8x the p50 latency. This is also true for delete operations. Currently, after introducing the checkpoint backpressure mechanism[2], a newly triggered checkpoint can be delayed because checkpoints are not cleaned up fast enough [3]. Moreover, Flink's checkpoint cleanup mechanism cannot leverage a delete-folder API to speed up the procedure with incremental checkpoints[4]. This is extremely obvious in cloud object storage, and almost all object storage SDKs have a multi-delete API to accelerate performance, e.g. AWS S3 [5], Aliyun OSS [6], and Tencentyun COS [7]. A simple experiment shows that deleting 1000 objects, each 5MB in size, costs 39494ms with for-loop single delete operations, and the result drops to 1347ms when using the multi-delete API in Tencent Cloud. However, Flink's FileSystem API follows HDFS's FileSystem API and lacks such a multi-delete API, which is somewhat outdated in today's cloud-based environments. Thus I suggest adding a multi-delete API to Flink's FileSystem[8] class; file systems that do not support multi-delete will fall back to a for-loop of single deletes. 
By doing so, we can at least accelerate the speed of discarding checkpoints in cloud environments. WDYT? [1] https://cwiki.apache.org/confluence/display/FLINK/FLIP-158%3A+Generalized+incremental+checkpoints#FLIP158:Generalizedincrementalcheckpoints-DFSwritelatency [2] https://issues.apache.org/jira/browse/FLINK-17073 [3] https://issues.apache.org/jira/browse/FLINK-26590 [4] https://github.com/apache/flink/blob/1486fee1acd9cd1e340f6d2007f723abd20294e5/flink-runtime/src/main/java/org/apache/flink/runtime/checkpoint/CompletedCheckpoint.java#L315 [5] https://docs.aws.amazon.com/AmazonS3/latest/userguide/delete-multiple-objects.html [6] https://www.alibabacloud.com/help/en/object-storage-service/latest/delete-objects-8#section-v6n-zym-tax [7] https://intl.cloud.tencent.com/document/product/436/44018#delete-objects-in-batch [8] https://github.com/apache/flink/blob/master/flink-core/src/main/java/org/apache/flink/core/fs/FileSystem.java Best Yun Tang
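The suggestion above can be sketched as a default method on a hypothetical file-system interface: object stores override it with one batched request, everything else falls back to a per-file loop (names are illustrative, not Flink's actual FileSystem API):

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical slice of a FileSystem API illustrating the proposal:
// a batch-delete entry point whose default rolls back to for-loop
// single deletes for file systems (e.g. HDFS) without native support.
interface SimpleFileSystem {
    boolean delete(String path) throws IOException;

    // Object-store implementations would override this with one batched
    // request (S3 DeleteObjects, Aliyun OSS / Tencent COS batch delete).
    default void deleteAll(List<String> paths) throws IOException {
        for (String path : paths) {
            delete(path);
        }
    }
}

public class MultiDeleteDemo {
    static class RecordingFs implements SimpleFileSystem {
        final Set<String> deleted = new HashSet<>();
        int calls = 0;

        @Override
        public boolean delete(String path) {
            calls++;
            return deleted.add(path);
        }
    }

    // Without an override, discarding a checkpoint of N files costs N calls.
    public static int fallbackCallCount(int files) {
        RecordingFs fs = new RecordingFs();
        List<String> paths = new ArrayList<>();
        for (int i = 0; i < files; i++) {
            paths.add("chk-1/file-" + i);
        }
        try {
            fs.deleteAll(paths);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return fs.calls;
    }

    public static void main(String[] args) {
        System.out.println(fallbackCallCount(1000)); // prints 1000
    }
}
```

The 1000 calls here stand in for the 1000 round trips of the 39494ms experiment; an overriding object-store implementation would collapse them into a single request.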
[jira] [Created] (FLINK-28178) Show the delegated StateBackend and whether changelog is enabled in the UI
Feifan Wang created FLINK-28178: --- Summary: Show the delegated StateBackend and whether changelog is enabled in the UI Key: FLINK-28178 URL: https://issues.apache.org/jira/browse/FLINK-28178 Project: Flink Issue Type: Improvement Reporter: Feifan Wang If changelog is enabled, the StateBackend shown in the Web UI is always 'ChangelogStateBackend'. I think ChangelogStateBackend should not be exposed to the user; we should show the delegated StateBackend here instead. We should also add a row indicating whether changelog is enabled. -- This message was sent by Atlassian Jira (v8.20.7#820007)
[jira] [Created] (FLINK-28172) Scatter dstl files into separate directories by job id
Feifan Wang created FLINK-28172: --- Summary: Scatter dstl files into separate directories by job id Key: FLINK-28172 URL: https://issues.apache.org/jira/browse/FLINK-28172 Project: Flink Issue Type: Improvement Components: Runtime / State Backends Affects Versions: 1.15.0 Reporter: Feifan Wang In the current implementation of {_}FsStateChangelogStorage{_}, dstl files from all jobs are put into the same directory (configured via {_}dstl.dfs.base-path{_}). Everything is fine on a file system like S3, but on a file system like HDFS there are some problems. First, there may be an upper limit on the number of files in a single directory, and raising that threshold will greatly reduce the performance of the distributed file system. Second, dstl file management becomes difficult because the user cannot tell which job a dstl file belongs to, especially when retained checkpoints are turned on. h3. Proposal # create a subdirectory named after the job id under the _dstl.dfs.base-path_ directory when the job starts # upload all dstl files to that subdirectory ( Going a step further, we can even create two levels of subdirectories under the _dstl.dfs.base-path_ directory, like _base-path/\{jobId}/dstl ._ This way, if the user configures the same dstl.dfs.base-path as state.checkpoints.dir, all files needed for job recovery will be in the same directory and well organized. ) -- This message was sent by Atlassian Jira (v8.20.7#820007)
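A minimal sketch of the proposed layout, assuming a hypothetical helper that derives each job's upload directory from the configured base path:

```java
import java.nio.file.Path;
import java.nio.file.Paths;

public class DstlPathLayout {
    // Proposal steps 1+2: place each job's dstl files under
    // base-path/{jobId}/dstl instead of one shared directory.
    public static Path dstlDirFor(String basePath, String jobId) {
        return Paths.get(basePath, jobId, "dstl");
    }

    public static void main(String[] args) {
        System.out.println(dstlDirFor("/checkpoints", "1b080b6e710aabbef8993ab18c6de98b"));
    }
}
```

With dstl.dfs.base-path set to the same value as state.checkpoints.dir, this puts the dstl files next to the job's chk-* and shared/ directories.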
Re: [ANNOUNCE] New Apache Flink Committers: Qingsheng Ren, Shengkai Fang
Congratulations, Qingsheng and ShengKai. Best, Feifan —— Name: Feifan Wang Email: zoltar9...@163.com Replied Message | From | Zakelly Lan | | Date | 06/20/2022 16:51 | | To | | | Subject | Re: [ANNOUNCE] New Apache Flink Committers: Qingsheng Ren, Shengkai Fang | Congrats, Qingsheng and Shengkai. Best, Zakelly On Mon, Jun 20, 2022 at 4:50 PM Geng Biao wrote: Congrats to Qingsheng and Shengkai! Best, Biao Geng 获取 Outlook for iOS<https://aka.ms/o0ukef> 发件人: Zhilong Hong 发送时间: Monday, June 20, 2022 4:05:48 PM 收件人: dev@flink.apache.org 抄送: Qingsheng Ren ; Shengkai Fang 主题: Re: [ANNOUNCE] New Apache Flink Committers: Qingsheng Ren, Shengkai Fang Congratulations, Qingsheng and ShengKai! Best, Zhilong. On Mon, Jun 20, 2022 at 4:00 PM Lijie Wang wrote: Congratulations, Qingsheng and ShengKai. Best, Lijie Paul Lam 于2022年6月20日周一 15:58写道: Congrats, Qingsheng and Shengkai! Best, Paul Lam 2022年6月20日 15:57,Martijn Visser 写道: Congratulations to both of you, this is very much deserved! Op ma 20 jun. 2022 om 09:53 schreef Jark Wu : Hi everyone, On behalf of the PMC, I'm very happy to announce two new Flink committers: Qingsheng Ren and Shengkai Fang. Qingsheng is the core contributor and maintainer of the Kafka connector. He continuously improved the existing connectors, debugged many connector testability issues, and worked on the connector testing framework. Recently, he is driving the work of FLIP-221 (caching lookup connector), which is crucial for SQL connectors. Shengkai has been continuously contributing to the Flink project for two years. He mainly works on Flink SQL parts and drives several important FLIPs, e.g. FLIP-149 (upsert-kafka), FLIP-163 (SQL CLI Improvements), FLIP-91 (SQL Gateway), and FLIP-223 (HiveServer2 Endpoint). He is very active and helps many users on the mailing list. Please join me in welcoming them as committers! Cheers, Jark Wu
Re: [DISCUSS ] Make state.backend.incremental as true by default
Thanks for bringing this up. Strongly +1 —— Name: Feifan Wang Email: zoltar9...@163.com Replied Message | From | Yuan Mei | | Date | 06/15/2022 11:41 | | To | dev , | | Subject | Re: [DISCUSS ] Make state.backend.incremental as true by default | Thanks for bringing this up. I am +1 on making incremental checkpoints by default for RocksDB, but not universally for all state backends. Besides being widely used in prod, enabling incremental checkpoint for RocksDB by default is also a pre-requisite when enabling task-local by default FLINK-15507 <https://issues.apache.org/jira/browse/FLINK-15507> The incremental checkpoint for the hashmap statebackend is under review right now. CC @ro...@ververica.com , which is not a good idea being enabled by default in the first version. Best, Yuan On Tue, Jun 14, 2022 at 7:33 PM Jiangang Liu wrote: +1 for the suggestion. We have use the incremental checkpoint in our production for a long time. Hangxiang Yu 于2022年6月14日周二 15:41写道: +1 It's basically enabled in most scenarios in production environments. For HashMapStateBackend, it will adopt a full checkpoint even if we enable incremental checkpoint. It will also support incremental checkpoint after [1]. It's compatible. BTW, I think we may also need to improve the documentation of incremental checkpoints which users usually ask. There are some tickets like [2][3]. Best, Hangxiang. [1] https://issues.apache.org/jira/browse/FLINK-21648 [2] https://issues.apache.org/jira/browse/FLINK-22797 [3] https://issues.apache.org/jira/browse/FLINK-7449 On Mon, Jun 13, 2022 at 7:48 PM Rui Fan <1996fan...@gmail.com> wrote: Strongly +1 Best, Rui Fan On Mon, Jun 13, 2022 at 7:35 PM Martijn Visser < martijnvis...@apache.org wrote: BTW, from my knowledge, nothing would happen for HashMapStateBackend, which does not support incremental checkpoint yet, when enabling incremental checkpoints. Thanks Yun, if no errors would occur then definitely +1 to enable it by default Op ma 13 jun. 
2022 om 12:42 schreef Alexander Fedulov < alexan...@ververica.com>: +1 From my experience, it is actually hard to come up with use cases where incremental checkpoints should explicitly not be enabled with the RocksDB state backend. If the state is so small that the full snapshots do not have any negative impact, one should consider using HashMapStateBackend anyway. Best, Alexander Fedulov On Mon, Jun 13, 2022 at 12:26 PM Jing Ge wrote: +1 Glad to see the kickoff of this discussion. Thanks Lihe for driving this! We have actually already discussed it internally a few months ago. After considering some corner cases, all agreed on enabling the incremental checkpoint as default. Best regards, Jing On Mon, Jun 13, 2022 at 12:17 PM Yun Tang wrote: Strongly +1 for making incremental checkpoints as default. Many users have ever been asking why this configuration is not enabled by default. BTW, from my knowledge, nothing would happen for HashMapStateBackend, which does not support incremental checkpoint yet, when enabling incremental checkpoints. Best Yun Tang From: Martijn Visser Sent: Monday, June 13, 2022 18:05 To: dev@flink.apache.org Subject: Re: [DISCUSS ] Make state.backend.incremental as true by default Hi Lihe, What happens if we enable incremental checkpoints by default while the used memory backend is HashMapStateBackend, which doesn't support incremental checkpoints? Best regards, Martijn Op ma 13 jun. 2022 om 11:59 schreef Lihe Ma : Hi, Everyone, I would like to open a discussion on setting incremental checkpoint as default behavior. Currently, the configuration of state.backend.incremental is set as false by default. Incremental checkpoint has been adopted widely in industry community for many years , and it is also well-tested from the feedback in the community discussion. Incremental checkpointing is more light-weighted: shorter checkpoint duration, less uploaded data and less resource consumption. 
In terms of backward compatibility, enable incremental checkpointing would not make any data loss no matter restoring from a full checkpoint/savepoint or an incremental checkpoint. FLIP-193 (Snapshot ownership)[1] has been released in 1.15, incremental checkpoint no longer depends on a previous restored checkpoint in default NO_CLAIM mode, which makes the checkpoint lineage much cleaner, it is a good chance to change the configuration state.backend.incremental to true as default. Thus, based on the above discussion, I suggest to make state.backend.incremental as true by default. What do you think of this proposal? [1] https://cwiki.apache.org/confluence/display/FLINK/FLIP-193%3A+Snapshots+ownership Best regards, Lihe Ma
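For context, the setting under discussion is a single flag; until the default changes, users opt in explicitly in flink-conf.yaml (values as named in the thread):

```yaml
# Enable incremental checkpoints for the RocksDB state backend
# (the proposed new default; currently false by default).
state.backend: rocksdb
state.backend.incremental: true
```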
[jira] [Created] (FLINK-27841) RocksDB cache miss increase in 1.15
Feifan Wang created FLINK-27841: --- Summary: RocksDB cache miss increase in 1.15 Key: FLINK-27841 URL: https://issues.apache.org/jira/browse/FLINK-27841 Project: Flink Issue Type: Improvement Components: Runtime / State Backends Affects Versions: 1.15.0 Reporter: Feifan Wang Attachments: image-2022-05-31-00-21-49-338.png, image-2022-05-31-00-22-45-123.png I ran the same job with 1.12.2 and 1.15.0 and found that CPU busy is much higher than in 1.12. After careful comparison, I found a higher cache miss rate in 1.15, but the block cache size is the same. Is this expected? I know the RocksDB version in 1.15 is 6.20.3 and in 1.12 is 5.17.2; is this just caused by the different RocksDB versions? !image-2022-05-31-00-21-49-338.png|width=599,height=146! !image-2022-05-31-00-22-45-123.png|width=594,height=175! test information: job type: regular join parallelism: 160 TaskManager memory: 32G num of TaskManagers: 10 -- This message was sent by Atlassian Jira (v8.20.7#820007)
[jira] [Created] (FLINK-27734) Not showing checkpoint interval properly in WebUI when checkpoint is disabled
Feifan Wang created FLINK-27734: --- Summary: Not showing checkpoint interval properly in WebUI when checkpoint is disabled Key: FLINK-27734 URL: https://issues.apache.org/jira/browse/FLINK-27734 Project: Flink Issue Type: Bug Components: Runtime / Web Frontend Reporter: Feifan Wang Attachments: image-2022-05-22-23-42-46-365.png Not showing checkpoint interval properly in WebUI when checkpoint is disabled !image-2022-05-22-23-42-46-365.png|width=1019,height=362! -- This message was sent by Atlassian Jira (v8.20.7#820007)
[jira] [Created] (FLINK-27187) The attemptsPerUpload metric may be lower than it actually is
Feifan Wang created FLINK-27187: --- Summary: The attemptsPerUpload metric may be lower than it actually is Key: FLINK-27187 URL: https://issues.apache.org/jira/browse/FLINK-27187 Project: Flink Issue Type: Bug Components: Runtime / State Backends Reporter: Feifan Wang The attemptsPerUpload metric in ChangelogStorageMetricGroup indicates the distribution of the number of attempts per upload. In the current implementation, each successful attempt tries to update attemptsPerUpload with its attemptNumber. But consider this case: # attempt 1 times out, then attempt 2 is scheduled # attempt 1 completes before attempt 2 and updates attemptsPerUpload with 1 In fact there were two attempts, but attemptsPerUpload was updated with 1. So I think we should add an "actionAttemptsCount" field to RetryExecutor.RetriableActionAttempt, shared across all attempts of the same upload action and representing the number of upload attempts, and the completed attempt should use this field to update attemptsPerUpload. What do you think? [~ym], [~roman] -- This message was sent by Atlassian Jira (v8.20.1#820001)
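The proposed fix can be sketched with a counter shared across all attempts of the same upload action, so whichever attempt completes reports the true total (hypothetical names standing in for the RetryExecutor internals):

```java
import java.util.concurrent.atomic.AtomicInteger;

public class AttemptsPerUploadDemo {
    // One counter per upload action, shared by every attempt of that action
    // (the "actionAttemptsCount" proposed for RetriableActionAttempt).
    static class UploadAction {
        final AtomicInteger attemptsStarted = new AtomicInteger();

        void startAttempt() {
            attemptsStarted.incrementAndGet();
        }
    }

    public static int reportedAttempts() {
        UploadAction action = new UploadAction();
        action.startAttempt(); // attempt 1: times out but keeps running
        action.startAttempt(); // attempt 2: the scheduled retry
        // Attempt 1 completes first. Reporting its own attemptNumber (1)
        // under-counts; reporting the shared counter gives the true total.
        return action.attemptsStarted.get();
    }

    public static void main(String[] args) {
        System.out.println(reportedAttempts()); // prints 2, not 1
    }
}
```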
[jira] [Created] (FLINK-27155) Reduce multiple reads to the same Changelog file in the same taskmanager during restore
Feifan Wang created FLINK-27155: --- Summary: Reduce multiple reads to the same Changelog file in the same taskmanager during restore Key: FLINK-27155 URL: https://issues.apache.org/jira/browse/FLINK-27155 Project: Flink Issue Type: Improvement Components: Runtime / Checkpointing, Runtime / State Backends Reporter: Feifan Wang h3. Background In the current implementation, state changes of different operators in the same taskmanager may be written to the same changelog file, which effectively reduces the number of files and requests to DFS. On the other hand, the current implementation also reads the same changelog file multiple times on recovery. More specifically, the number of times the same changelog file is accessed is related to the number of ChangeSets contained in it, and since each read needs to skip the preceding bytes, that network traffic is also wasted. The result is a lot of unnecessary requests to DFS when there are multiple slots and keyed state in the same taskmanager. h3. Proposal We can reduce multiple reads to the same changelog file in the same taskmanager during restore. One possible approach is to read the changelog file all at once and cache it in memory or in a local file for a period of time. I think this could be a subtask of [v2 FLIP-158: Generalized incremental checkpoints|https://issues.apache.org/jira/browse/FLINK-25842] . Hi [~ym], [~roman], what do you think? -- This message was sent by Atlassian Jira (v8.20.1#820001)
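One way to sketch the proposal: download each changelog file at most once per TaskManager and serve later ChangeSet reads from a cache keyed by the file handle (hypothetical names, in-memory caching; the real design could equally use a local file with an expiry):

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;

public class ChangelogReadCache {
    private final Map<String, byte[]> cache = new HashMap<>();
    final AtomicInteger remoteReads = new AtomicInteger();

    // Download the changelog file at most once; later ChangeSet reads seek
    // within the cached bytes instead of re-opening the file on DFS.
    public byte[] get(String handle, Function<String, byte[]> downloadFromDfs) {
        return cache.computeIfAbsent(handle, h -> {
            remoteReads.incrementAndGet();
            return downloadFromDfs.apply(h);
        });
    }

    public static void main(String[] args) {
        ChangelogReadCache cache = new ChangelogReadCache();
        Function<String, byte[]> dfs = h -> new byte[]{1, 2, 3};
        cache.get("changelog-001", dfs); // first ChangeSet: one remote read
        cache.get("changelog-001", dfs); // later ChangeSets: cache hit
        System.out.println(cache.remoteReads.get()); // prints 1
    }
}
```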
[jira] [Created] (FLINK-27105) wrong metric type
Feifan Wang created FLINK-27105: --- Summary: wrong metric type Key: FLINK-27105 URL: https://issues.apache.org/jira/browse/FLINK-27105 Project: Flink Issue Type: Bug Reporter: Feifan Wang -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Created] (FLINK-26799) StateChangeFormat#read not seek to offset correctly
Feifan Wang created FLINK-26799: --- Summary: StateChangeFormat#read not seek to offset correctly Key: FLINK-26799 URL: https://issues.apache.org/jira/browse/FLINK-26799 Project: Flink Issue Type: Bug Components: Runtime / State Backends Reporter: Feifan Wang StateChangeFormat#read must seek to the offset before reading; the current implementation is as follows: {code:java} FSDataInputStream stream = handle.openInputStream(); DataInputViewStreamWrapper input = wrap(stream); if (stream.getPos() != offset) { LOG.debug("seek from {} to {}", stream.getPos(), offset); input.skipBytesToRead((int) offset); }{code} But the if condition is incorrect: stream.getPos() returns the position of the underlying stream, which is different from the position of input. Moreover, because it is wrapped in a BufferedInputStream, the position of the underlying stream is always at n*bufferSize or at the end of the file. Actually, input always starts at position 0, so I think we can skip to the offset directly. -- This message was sent by Atlassian Jira (v8.20.1#820001)
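Since input always starts at logical position 0, the fix reduces to skipping offset bytes unconditionally. A runnable sketch of that idea, with plain JDK streams standing in for Flink's wrappers:

```java
import java.io.BufferedInputStream;
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

public class SeekToOffsetDemo {
    // The reader's view of the stream starts at logical position 0 no matter
    // where the underlying buffered stream's raw position sits, so skip the
    // offset unconditionally instead of comparing it to stream.getPos().
    public static int readAtOffset(byte[] data, long offset) {
        try (DataInputStream input = new DataInputStream(
                new BufferedInputStream(new ByteArrayInputStream(data)))) {
            long remaining = offset;
            while (remaining > 0) {
                long skipped = input.skip(remaining);
                if (skipped <= 0) {
                    throw new IOException("could not skip to offset " + offset);
                }
                remaining -= skipped;
            }
            return input.read(); // first byte of the target state change
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        byte[] data = {10, 20, 30, 40};
        System.out.println(readAtOffset(data, 2)); // prints 30
    }
}
```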
[jira] [Created] (FLINK-26766) Mistake in ChangelogStateHandleStreamImpl#getIntersection
Feifan Wang created FLINK-26766: --- Summary: Mistake in ChangelogStateHandleStreamImpl#getIntersection Key: FLINK-26766 URL: https://issues.apache.org/jira/browse/FLINK-26766 Project: Flink Issue Type: Bug Components: Runtime / State Backends Reporter: Feifan Wang There may be a mistake in ChangelogStateHandleStreamImpl: {code:java} public KeyedStateHandle getIntersection(KeyGroupRange keyGroupRange) { KeyGroupRange offsets = keyGroupRange.getIntersection(keyGroupRange); // .. } {code} I guess it should be: KeyGroupRange offsets = {color:#de350b}this{color}.keyGroupRange.getIntersection(keyGroupRange); Hi [~roman], can you help confirm that? -- This message was sent by Atlassian Jira (v8.20.1#820001)
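The suspected bug is easy to reproduce with a minimal range type: intersecting the argument with itself returns the argument unchanged instead of clipping it to this handle's own range (a sketch, not Flink's actual classes):

```java
public class IntersectionBugDemo {
    static class Range {
        final int start, end; // inclusive bounds

        Range(int start, int end) {
            this.start = start;
            this.end = end;
        }

        Range getIntersection(Range other) {
            return new Range(Math.max(start, other.start), Math.min(end, other.end));
        }
    }

    // Buggy form from the ticket: keyGroupRange.getIntersection(keyGroupRange)
    public static Range buggy(Range handleRange, Range requested) {
        return requested.getIntersection(requested); // ignores handleRange entirely
    }

    // Proposed fix: this.keyGroupRange.getIntersection(keyGroupRange)
    public static Range fixed(Range handleRange, Range requested) {
        return handleRange.getIntersection(requested);
    }

    public static void main(String[] args) {
        Range handle = new Range(0, 9);     // key groups owned by this handle
        Range requested = new Range(5, 20); // key groups being restored
        System.out.println(buggy(handle, requested).end); // 20: exceeds the handle's range
        System.out.println(fixed(handle, requested).end); // 9: clipped correctly
    }
}
```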
[jira] [Created] (FLINK-24562) YarnResourceManagerDriverTest should not use ContainerStatusPBImpl.newInstance
Feifan Wang created FLINK-24562: --- Summary: YarnResourceManagerDriverTest should not use ContainerStatusPBImpl.newInstance Key: FLINK-24562 URL: https://issues.apache.org/jira/browse/FLINK-24562 Project: Flink Issue Type: Improvement Components: Deployment / YARN Affects Versions: 1.14.0 Reporter: Feifan Wang In YarnResourceManagerDriverTest, we create ContainerStatus with the static method ContainerStatusPBImpl{{.newInstance}}, which is annotated as private and unstable. Although this method is still available in the latest version of yarn, some third-party versions of yarn may modify it. In fact, this method was modified in the internal version provided by our yarn team, which caused flink-1.14.0 to fail to compile. Moreover, there is already an org.apache.flink.yarn.TestingContainerStatus, I think we should use it directly. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (FLINK-24384) Count checkpoints failed in trigger phase into numberOfFailedCheckpoints
Feifan Wang created FLINK-24384: --- Summary: Count checkpoints failed in trigger phase into numberOfFailedCheckpoints Key: FLINK-24384 URL: https://issues.apache.org/jira/browse/FLINK-24384 Project: Flink Issue Type: Improvement Components: Runtime / Checkpointing Reporter: Feifan Wang h1. *Problem* In the current implementation, checkpoints that fail in the trigger phase are not counted into the metric 'numberOfFailedCheckpoints', so users cannot tell from this metric that checkpointing has stopped. Users can still alert with rules like _*'numberOfCompletedCheckpoints' has not increased in the past few minutes*_ (maybe checkpoint interval + timeout), but I think that is roundabout and cannot alert in a timely manner. h1. *Proposal* As the title says, count checkpoints that fail in the trigger phase into 'numberOfFailedCheckpoints'. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (FLINK-24274) Wrong parameter order in documentation of State Processor API
Feifan Wang created FLINK-24274: --- Summary: Wrong parameter order in documentation of State Processor API Key: FLINK-24274 URL: https://issues.apache.org/jira/browse/FLINK-24274 Project: Flink Issue Type: Improvement Components: Documentation Reporter: Feifan Wang Attachments: image-2021-09-14-02-09-44-334.png, image-2021-09-14-02-11-12-034.png Wrong order of parameters path and stateBackend in example code of [State Processor Api # modifying-savepoints|https://nightlies.apache.org/flink/flink-docs-release-1.13/docs/libs/state_processor_api/#modifying-savepoints] !image-2021-09-14-02-09-44-334.png|width=489,height=126! !image-2021-09-14-02-11-12-034.png|width=478,height=222! -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (FLINK-24159) document of entropy injection may mislead users
Feifan Wang created FLINK-24159: --- Summary: document of entropy injection may mislead users Key: FLINK-24159 URL: https://issues.apache.org/jira/browse/FLINK-24159 Project: Flink Issue Type: Improvement Components: Documentation, Runtime / Checkpointing Reporter: Feifan Wang FLINK-9061 introduced entropy injection into S3 paths for better scalability, but the document on [entropy-injection-for-s3-file-systems|https://ci.apache.org/projects/flink/flink-docs-master/docs/deployment/filesystems/s3/#entropy-injection-for-s3-file-systems] uses an example with the checkpoint directory "{color:#FF}s3://my-bucket/checkpoints/_entropy_/dashboard-job/{color}"; with this configuration every checkpoint key still starts with the constant checkpoints/ prefix, which actually reduces scalability. Thanks to dmtolpeko for describing this issue in his blog ( [flink-and-s3-entropy-injection-for-checkpoints |http://cloudsqale.com/2021/01/02/flink-and-s3-entropy-injection-for-checkpoints/]). h3. Proposal Alter the checkpoint directory in the document on [entropy-injection-for-s3-file-systems|https://ci.apache.org/projects/flink/flink-docs-master/docs/deployment/filesystems/s3/#entropy-injection-for-s3-file-systems] to "{color:#FF}s3://my-bucket/_entropy_/checkpoints/dashboard-job/{color}" (so the entropy key is at the start of the keys). If this proposal is appropriate, I am glad to submit a PR to modify the document. Any other ideas? -- This message was sent by Atlassian Jira (v8.3.4#803005)
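The difference is easy to see by expanding the entropy marker by hand (simplified; the entropy-injecting file system substitutes a random string per checkpoint):

```java
public class EntropyPathDemo {
    // Substitute the entropy key the way the injecting file system would.
    public static String inject(String template, String entropy) {
        return template.replace("_entropy_", entropy);
    }

    public static void main(String[] args) {
        // Documented (problematic) layout: every key still shares the constant
        // "checkpoints/" prefix, so S3 partitions the keys poorly.
        System.out.println(inject("s3://my-bucket/checkpoints/_entropy_/dashboard-job/", "x7f3"));
        // Proposed layout: entropy comes first, spreading keys across partitions.
        System.out.println(inject("s3://my-bucket/_entropy_/checkpoints/dashboard-job/", "x7f3"));
    }
}
```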
[jira] [Created] (FLINK-24149) Make checkpoint relocatable
Feifan Wang created FLINK-24149: --- Summary: Make checkpoint relocatable Key: FLINK-24149 URL: https://issues.apache.org/jira/browse/FLINK-24149 Project: Flink Issue Type: Improvement Components: Runtime / Checkpointing Reporter: Feifan Wang h3. Background FLINK-5763 proposed making savepoints relocatable; checkpoints have similar requirements. For example, to migrate jobs to other HDFS clusters: although this can be achieved through a savepoint, we prefer to use persistent checkpoints, especially since RocksDBStateBackend incremental checkpoints have better snapshot and restore performance than savepoints. FLINK-8531 standardized the directory layout : {code:java} /user-defined-checkpoint-dir | + 1b080b6e710aabbef8993ab18c6de98b (job's ID) | + --shared/ + --taskowned/ + --chk-1/ + --chk-2/ + --chk-3/ ... {code} * State backend will create a subdirectory with the job's ID that will contain the actual checkpoints, such as: user-defined-checkpoint-dir/1b080b6e710aabbef8993ab18c6de98b/ * Each checkpoint individually will store all its files in a subdirectory that includes the checkpoint number, such as: user-defined-checkpoint-dir/1b080b6e710aabbef8993ab18c6de98b/chk-3/ * Files shared between checkpoints will be stored in the shared/ directory in the same parent directory as the separate checkpoint directory, such as: user-defined-checkpoint-dir/1b080b6e710aabbef8993ab18c6de98b/shared/ * Similar to shared files, files owned strictly by tasks will be stored in the taskowned/ directory in the same parent directory as the separate checkpoint directory, such as: user-defined-checkpoint-dir/1b080b6e710aabbef8993ab18c6de98b/taskowned/ h3. Proposal Since an individual checkpoint directory does not contain complete state data, we cannot make it relocatable on its own, but its parent directory can be. The only work left is to make the metadata file reference relative file paths. 
I propose making these changes to _*FsCheckpointStateOutputStream*_: * introduce a _*checkpointDirectory*_ field * introduce an *_entropyInjecting_* field * make *_closeAndGetHandle()_* return a _*RelativeFileStateHandle*_ with a relative path based on _*checkpointDirectory*_ (except for entropy-injecting file systems) [~yunta], [~trohrmann], I verified this in our environment, and I will submit a pull request to accomplish this feature. Please help evaluate whether it is appropriate. -- This message was sent by Atlassian Jira (v8.3.4#803005)
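Making the metadata reference relative paths boils down to resolving each state file against the job's checkpoint directory. With JDK paths the idea looks like this (a sketch of the concept, not the proposed FsCheckpointStateOutputStream change itself):

```java
import java.nio.file.Path;
import java.nio.file.Paths;

public class RelocatableCheckpointDemo {
    // Store the path relative to the job's checkpoint directory so the whole
    // directory can be moved and the metadata still resolves correctly.
    public static String relativeReference(String checkpointDir, String stateFile) {
        Path base = Paths.get(checkpointDir);
        return base.relativize(Paths.get(stateFile)).toString().replace('\\', '/');
    }

    public static void main(String[] args) {
        String base = "/ckpts/1b080b6e710aabbef8993ab18c6de98b";
        // A shared sst file referenced from chk-3/_metadata becomes
        // "shared/000123.sst"; after relocating the job directory,
        // resolving that against the new base finds the same file.
        System.out.println(relativeReference(base, base + "/shared/000123.sst"));
    }
}
```

Entropy-injecting file systems are the exception noted above: the entropy segment differs per file, so a common base directory cannot be relied on there.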
[jira] [Created] (FLINK-23949) first incremental checkpoint after a savepoint will degenerate into a full checkpoint
Feifan Wang created FLINK-23949: --- Summary: first incremental checkpoint after a savepoint will degenerate into a full checkpoint Key: FLINK-23949 URL: https://issues.apache.org/jira/browse/FLINK-23949 Project: Flink Issue Type: Improvement Components: Runtime / State Backends Affects Versions: 1.13.2, 1.12.5, 1.11.4 Reporter: Feifan Wang Attachments: image-2021-08-25-00-59-05-779.png In RocksIncrementalSnapshotStrategy we record the uploaded RocksDB files corresponding to the checkpoint id, and clean them in _CheckpointListener#notifyCheckpointComplete_. {code:java} @Override public void notifyCheckpointComplete(long completedCheckpointId) { synchronized (materializedSstFiles) { if (completedCheckpointId > lastCompletedCheckpointId) { materializedSstFiles .keySet() .removeIf(checkpointId -> checkpointId < completedCheckpointId); lastCompletedCheckpointId = completedCheckpointId; } } }{code} This works well without savepoints, but when a savepoint completes, it cleans up the _materializedSstFiles_ of the previous checkpoint. As a result, the first checkpoint after a savepoint must upload all files in RocksDB. !image-2021-08-25-00-59-05-779.png|width=1640,height=225! Solving the problem is also very simple; I propose changing CheckpointListener#notifyCheckpointComplete to the following form: {code:java} @Override public void notifyCheckpointComplete(long completedCheckpointId) { synchronized (materializedSstFiles) { if (completedCheckpointId > lastCompletedCheckpointId && materializedSstFiles.keySet().contains(completedCheckpointId)) { materializedSstFiles .keySet() .removeIf(checkpointId -> checkpointId < completedCheckpointId); lastCompletedCheckpointId = completedCheckpointId; } } } {code} -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (FLINK-21986) taskmanager native memory not release timely after restart
Feifan Wang created FLINK-21986: --- Summary: taskmanager native memory not release timely after restart Key: FLINK-21986 URL: https://issues.apache.org/jira/browse/FLINK-21986 Project: Flink Issue Type: Bug Components: Runtime / State Backends Affects Versions: 1.12.1 Environment: flink version: 1.12.1 run: yarn session job type: mock source -> regular join checkpoint interval: 3m Taskmanager memory: 16G Reporter: Feifan Wang Attachments: image-2021-03-25-15-53-44-214.png, image-2021-03-25-16-07-29-083.png, image-2021-03-26-11-46-06-828.png, image-2021-03-26-11-47-21-388.png I ran a regular join job with Flink 1.12.1 and found that TaskManager native memory is not released in a timely manner after a restart caused by exceeding the checkpoint tolerable failure threshold. *Problem job information:* # the job first restarted because it exceeded the checkpoint tolerable failure threshold # then TaskManagers were killed by YARN many times # in this case, TM heap is set to 7.68G, but all TM heap usage is under 4.2G !image-2021-03-25-15-53-44-214.png|width=496,height=103! # nonheap size increases after restart, but stays under 160M !https://km.sankuai.com/api/file/cdn/706284607/716474606?contentType=1&isNewContent=false&isNewContent=false|width=493,height=102! # TaskManager process memory increases 3-4G after restart (this figure shows one TaskManager) !image-2021-03-25-16-07-29-083.png|width=493,height=107! *My guess:* The [RocksDB wiki|https://github.com/facebook/rocksdb/wiki/RocksJava-Basics#memory-management] mentions: Many of the Java Objects used in the RocksJava API will be backed by C++ objects for which the Java Objects have ownership. As C++ has no notion of automatic garbage collection for its heap in the way that Java does, we must explicitly free the memory used by the C++ objects when we are finished with them. So, is it possible that RocksDBStateBackend does not call AbstractNativeReference#close() to release the memory used by RocksDB C++ objects? 
*I made a change:* actively call System.gc() and System.runFinalization() every minute. *And ran this test again:* # TaskManager process memory shows no obvious increase !image-2021-03-26-11-46-06-828.png|width=495,height=93! # the job ran for several days and restarted many times, but no TaskManager was killed by YARN as before *Summary:* # first, some native memory is not released in a timely manner after restart in this situation # I guess it may be RocksDB C++ objects, but I have not confirmed this from the source code of RocksDBStateBackend -- This message was sent by Atlassian Jira (v8.3.4#803005)
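The workaround described above amounts to a scheduled task that periodically nudges GC and finalization so unreachable RocksDB Java wrappers free their C++ memory (a diagnostic sketch of the experiment, not a recommended production setting):

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

public class PeriodicGcWorkaround {
    // Diagnostic workaround from the ticket: force GC + finalization once a
    // minute so native memory held by unreachable RocksDB Java wrappers
    // (whose C++ objects are freed in finalizers/close()) is released promptly.
    public static ScheduledExecutorService start() {
        ScheduledExecutorService gcScheduler =
                Executors.newSingleThreadScheduledExecutor(r -> {
                    Thread t = new Thread(r, "periodic-gc");
                    t.setDaemon(true);
                    return t;
                });
        gcScheduler.scheduleAtFixedRate(() -> {
            System.gc();
            System.runFinalization();
        }, 1, 1, TimeUnit.MINUTES);
        return gcScheduler;
    }

    public static void main(String[] args) {
        ScheduledExecutorService scheduler = start();
        System.out.println("scheduler running: " + !scheduler.isShutdown());
        scheduler.shutdownNow();
    }
}
```

Note that System.runFinalization() is deprecated in recent JDKs; the proper fix is for the state backend to call AbstractNativeReference#close() explicitly, which is what the ticket suggests investigating.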