Re: [DISCUSS] FLIP-446: Kubernetes Operator State Snapshot CRD

2024-04-23 Thread Robert Metzger
Thank you all for the great discussion.
> 2. I agree with you that "completed" is not very clear, but I would suggest the name "alreadyExists". WDYT?
I'm fine with "alreadyExists" and for 3. also with "backoffLimit".
> really like the idea of having something like Pod Conditions, but I think

Re: [DISCUSS] FLIP-446: Kubernetes Operator State Snapshot CRD

2024-04-20 Thread Mate Czagany
Hi, Thanks for your comments, Gyula, I really appreciate it! I have updated the following things in the FLIP, please comment on these changes if you have any suggestions or concerns:
- Added path field to FlinkStateSnapshotReference
- Added two examples at the bottom.
- Added error handling

Re: [DISCUSS] FLIP-446: Kubernetes Operator State Snapshot CRD

2024-04-19 Thread Gyula Fóra
Hey! Regarding the question of initialSavepointPath and the new flinkStateSnapshotReference object, I think we could simply keep an extra field called path as part of the flinkStateSnapshotReference object. Then the fields could be: namespace, name, path. If path is defined we would use that (to
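A minimal sketch of the reference object described above, assuming the three fields are optional strings. The class layout, the describe helper, and the namespace/name lookup used when path is absent are illustrative assumptions; the message is cut off before it says what happens in that case, and the operator itself is not written in Python.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class FlinkStateSnapshotReference:
    """Hypothetical model with the three proposed fields."""
    namespace: Optional[str] = None
    name: Optional[str] = None
    path: Optional[str] = None

    def uses_path(self) -> bool:
        # From the message: if path is defined, that is what gets used.
        return bool(self.path)


def describe(ref: FlinkStateSnapshotReference) -> str:
    """Return a label showing which part of the reference would be used."""
    if ref.uses_path():
        return f"path:{ref.path}"
    # Assumption: without a path, the snapshot resource is found by namespace/name.
    return f"resource:{ref.namespace}/{ref.name}"
```

With this shape, a reference that sets all three fields still resolves through path, which matches the "if path is defined we would use that" rule.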

Re: [DISCUSS] FLIP-446: Kubernetes Operator State Snapshot CRD

2024-04-19 Thread Mate Czagany
Hi Robert and Thomas, Thank you for sharing your thoughts, I will try to address your questions and suggestions: 1. I would really love to hear others' input as well about separating the snapshot CRD into two different CRDs, one for savepoints and one for checkpoints. I think the main upside is that

Re: [DISCUSS] FLIP-446: Kubernetes Operator State Snapshot CRD

2024-04-19 Thread Thomas Weise
Thanks for the proposal. How do you see potential effects on API server performance wrt. number of objects vs mutations? Is the proposal more or less neutral in that regard? Thanks for the thorough feedback Robert. Couple more questions below. --> On Fri, Apr 19, 2024 at 5:07 AM Robert

Re: [DISCUSS] FLIP-446: Kubernetes Operator State Snapshot CRD

2024-04-19 Thread Robert Metzger
Hi Mate, thanks for proposing this, I'm really excited about your FLIP. I hope my questions make sense to you: 1. I would like to discuss the "FlinkStateSnapshot" name and the fact that users have to use either the savepoint or checkpoint spec inside the FlinkStateSnapshot. Wouldn't it be more

Re: [DISCUSS] FLIP-446: Kubernetes Operator State Snapshot CRD

2024-04-19 Thread Gyula Fóra
Cc'ing some folks who gave positive feedback on this idea in the past. I would love to hear your thoughts on the proposed design.
Gyula
On Tue, Apr 16, 2024 at 6:31 PM Őrhidi Mátyás wrote:
> +1 Looking forward to it
>
> On Tue, Apr 16, 2024 at 8:56 AM Mate Czagany wrote:
> > Thank you

Re: [DISCUSS] FLIP-446: Kubernetes Operator State Snapshot CRD

2024-04-16 Thread Őrhidi Mátyás
+1 Looking forward to it
On Tue, Apr 16, 2024 at 8:56 AM Mate Czagany wrote:
> Thank you Gyula!
>
> I think that is a great idea. I have updated the Google doc to only have 1 new configuration option of boolean type, which can be used to signal the Operator to use the old mode.
>
> Also

Re: [DISCUSS] FLIP-446: Kubernetes Operator State Snapshot CRD

2024-04-16 Thread Mate Czagany
Thank you Gyula! I think that is a great idea. I have updated the Google doc to only have 1 new configuration option of boolean type, which can be used to signal the Operator to use the old mode. As also described in the configuration description, the Operator will fall back to the old mode if the

Re: [DISCUSS] FLIP-446: Kubernetes Operator State Snapshot CRD

2024-04-16 Thread Gyula Fóra
Thanks Mate, this is great stuff. Mate, I think the new configs should probably default to the new mode and they should only be useful for users to fall back to the old behaviour. We could by default use the new Snapshot CRD if the CRD is installed, otherwise use the old mode by default and log a
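The defaulting rule suggested above could be sketched roughly as follows. The function name, the use_old_mode flag, and the "new"/"old" return values are hypothetical names chosen for illustration. The message is cut off after "log a", so the log level and wording below are placeholders, not something the thread specifies.

```python
import logging

logger = logging.getLogger("snapshot-mode-sketch")


def select_snapshot_mode(crd_installed: bool, use_old_mode: bool = False) -> str:
    """Pick the snapshot mode: the new CRD by default, the old mode as a fallback."""
    if use_old_mode:
        # The user explicitly asked for the old behaviour.
        return "old"
    if crd_installed:
        # Default: use the new Snapshot CRD when it is available.
        return "new"
    # The CRD is missing, so fall back to the old mode and log it.
    # Assumption: the level and message text are placeholders.
    logger.warning("Snapshot CRD not installed, using the old mode")
    return "old"
```

The explicit flag is checked first so that a user who opts out keeps the old behaviour even on clusters where the CRD is installed.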

Re: [DISCUSS] FLIP-446: Kubernetes Operator State Snapshot CRD

2024-04-16 Thread Mate Czagany
Hi Ferenc, Thank you for your comments. I have updated the Google doc with a new section for the new configs. All of the newly added config keys will have defaults set, and by default all the savepoint/checkpoint operations will use the old system: write their results to the

Re: [DISCUSS] FLIP-446: Kubernetes Operator State Snapshot CRD

2024-04-16 Thread Ferenc Csaky
Thank you Mate for initiating this discussion. +1 for this idea. Some Qs: Can you specify the newly introduced configurations in more detail? Currently, it is not fully clear to me what the possible values of `kubernetes.operator.periodic.savepoint.mode` are. Is it optional, and does it have a default value?

[DISCUSS] FLIP-446: Kubernetes Operator State Snapshot CRD

2024-04-16 Thread Mate Czagany
Hi Everyone, I would like to start a discussion on FLIP-446: Kubernetes Operator State Snapshot CRD. This FLIP adds a new custom resource for Operator users to create and manage their savepoints and checkpoints. I have also developed an initial POC to prove that this approach is feasible, you