Let's keep this discussion for another day so everyone can still chip in. In the meantime I have opened a JIRA ticket <https://issues.apache.org/jira/browse/FLINK-26432> for this. Also I invite you to review this PR <https://github.com/apache/flink-kubernetes-operator/pull/26> , it is ready with the base changes according to the discussion.
Once we have this merged it will be easier to parallelize work on the state machine/observation/reconcile logic. Gyula On Tue, Mar 1, 2022 at 11:24 AM Gyula Fóra <gyula.f...@gmail.com> wrote: > Hi All! > > Based on your ideas and suggestions I have made some improvements to the > refactor proposal PR: > https://github.com/apache/flink-kubernetes-operator/pull/26/commits > > I have now completely decoupled the observer from the reconciler and the > JobManagerDeploymentStatus is recorded in the FlinkDeployment resource > status. This greatly simplifies the flow and eliminates the need for any > caching. > > I haven't touched the reconciliation logic here, and did not introduce a > state machine to avoid making this change even bigger. The state machine > changes could be built on top of this as a second stage of the refactoring > work after this. > > I will work on improving the tests to capture and validate the flow better. > > Let me know what you think and if you have any ideas to further improve > the high level flow! > > Thanks > Gyula > > On Tue, Mar 1, 2022 at 9:07 AM Yang Wang <danrtsey...@gmail.com> wrote: > >> The state machine is going to replace the annoying if-else in the >> reconciler. >> It seems to have no conflicts with modular mechanism(aka observer, >> reconciler, etc.). >> And we could make them happen step by step. >> >> >> Best, >> Yang >> >> >> Gyula Fóra <gyula.f...@gmail.com> 于2022年3月1日周二 13:53写道: >> >> > And just one more thing I forgot to add: >> > >> > The observer does not have to be a dummy monolithic logic. Different job >> > types will have different state machines with different observe >> > requirements and different observer implementations. I think this is >> > complexly normal. The observer can be aware of the expected state and >> what >> > it should observe on the happy path. >> > >> > Gyula >> > >> > On Tue, 1 Mar 2022 at 06:42, Gyula Fóra <gyula.f...@gmail.com> wrote: >> > >> > > @Thomas: >> > > >> > > Thanks for the input! I completely agree with a well principled state >> > > machine based approach in the operator would be the best. >> > > >> > > You are right that the code contains mostly if then else based logic >> at >> > > the moment as it evolved after initial prototype to cover more >> scenarios. >> > > >> > > However I think the self contained observer - reconciler approach is >> very >> > > capable of implementing the suggested state machine based view in a >> very >> > > elegant way. >> > > >> > > Simply put, the observer is responsible for determining and recording >> the >> > > current state and any extra information that comes with that state. >> > > >> > > The reconciler then look at the current vs desired state and execute a >> > > state transition. >> > > >> > > Gyula >> > > >> > > On Tue, 1 Mar 2022 at 02:16, Thomas Weise <t...@apache.org> wrote: >> > > >> > >> Thanks for bringing the discussion. It's a good time to revisit this >> > >> area as the operator implementation becomes more complex. >> > >> >> > >> I think the most important part of the design are well defined states >> > >> and transitions. Unless that can be reasoned about, there will be >> > >> confusion. For example: >> > >> >> > >> 1) For job upgrade, it wasn't clear in which state reconciliation >> > >> should happen and as a result the implementation ended up incomplete. >> > >> 2) "JobStatusObserver" would attempt to list jobs before the REST API >> > >> is ready. And it would be used in session mode, even though the >> > >> session mode needs a different state machine. >> > >> >> > >> The implementation currently contains a lot of if-then-else logic, >> > >> which is hard to follow and difficult to maintain. It will make it >> > >> harder to introduce new states that would be necessary to implement >> > >> more advanced upgrade strategies, amongst others. >> > >> >> > >> Did you consider a state machine centric approach? See [1] for >> example. >> > >> >> > >> As Biao mentioned, observe -> reconcile may repeat and different >> > >> states will require different checks. Once a job (in job mode) is >> > >> running, there is no need to check the JM deployment etc. A >> monolithic >> > >> observer may not work so well for that. Rather, I think different >> > >> states have different monitoring needs that inform transitions, >> > >> including the actual state and changes made to the CR. >> > >> >> > >> It would also be good if the current state is directly reflected in >> > >> the CR status so that it is easier to check where the deployment is >> > >> at. >> > >> >> > >> Cheers, >> > >> Thomas >> > >> >> > >> >> > >> [1] >> > >> >> > >> https://github.com/lyft/flinkk8soperator/blob/master/docs/state_machine.md >> > >> >> > >> On Mon, Feb 28, 2022 at 8:16 AM Gyula Fóra <gyula.f...@gmail.com> >> > wrote: >> > >> > >> > >> > Hi All! >> > >> > >> > >> > Thank you for the feedback, I agree with what has been proposed to >> > >> include >> > >> > as much as possible in the actual resource status and make the >> > >> reconciler >> > >> > completely independent from the observer. >> > >> > >> > >> > @Biao Geng: >> > >> > Validation could depend on the current status (depending on how we >> > >> > implement the validation logic) so it might always be necessary >> (and >> > it >> > >> is >> > >> > also cheap). >> > >> > What you are saying with multiple observe -> reconcile cycles ties >> > back >> > >> to >> > >> > what Matyas said, that we should probably have an Observe loop >> until >> > we >> > >> > have a stable state ready for reconciliation, then reconcile once. >> > >> > >> > >> > So probably validate -> observe until stable -> reconcile -> >> observe >> > >> until >> > >> > stable >> > >> > >> > >> > Cheers, >> > >> > Gyula >> > >> > >> > >> > On Mon, Feb 28, 2022 at 4:49 PM Biao Geng <biaoge...@gmail.com> >> > wrote: >> > >> > >> > >> > > Hi Gyula, >> > >> > > Thanks for the discussion. It also makes senses to me on the >> > >> separation of >> > >> > > 3 components and Yang's proposal. >> > >> > > Just 1 follow-up thought after checking your PR: in the >> reconcile() >> > >> > > method of controller, IIUC, the real flow could be >> > >> > > `validate->observe->reconcile->validate->observe->reconcile...". >> The >> > >> > > validation phase seems to be required only when the creation of >> the >> > >> job >> > >> > > cluster and the upgrade of config. For phases like waiting the JM >> > from >> > >> > > deploying to ready, it is not mandatory and thus the flow can >> look >> > >> like >> > >> > > `validate->observe->reconcile->optional validate due to current >> > >> > > state->observe->reconcile...` >> > >> > > >> > >> > > Őrhidi Mátyás <matyas.orh...@gmail.com> 于2022年2月28日周一 21:26写道: >> > >> > > >> > >> > > > It is worth looking at the controller code in the spotify >> operator >> > >> too: >> > >> > > > >> > >> > > > >> > >> > > >> > >> >> > >> https://github.com/spotify/flink-on-k8s-operator/blob/master/controllers/flinkcluster/flinkcluster_controller.go >> > >> > > > >> > >> > > > It is looping in the 'observer phase' until it reaches a stable >> > >> state, >> > >> > > then >> > >> > > > it performs the necessary changes. >> > >> > > > >> > >> > > > Based on this I also suggest keeping the logic in separate >> > >> > > > modules(Validate->Observe->Reconcile). The three components >> might >> > >> not >> > >> > > > even be enough as we add more and more complexity to the code. >> > >> > > > >> > >> > > > Cheers, >> > >> > > > Matyas >> > >> > > > >> > >> > > > >> > >> > > > On Mon, Feb 28, 2022 at 2:03 PM Aitozi <gjying1...@gmail.com> >> > >> wrote: >> > >> > > > >> > >> > > > > Hi, Gyula >> > >> > > > > Thanks for driving this discussion. I second Yang >> Wang's >> > >> idea >> > >> > > that >> > >> > > > > it's better to make the `validator`, `observer` and >> `reconciler` >> > >> > > > > self-contained. I also prefer to define the `Observer` as an >> > >> interface >> > >> > > > and >> > >> > > > > we could define the statuses that `Observer` will expose. It >> > acts >> > >> like >> > >> > > > the >> > >> > > > > observer protocol between the `Observer` and `Reconciler`. >> > >> > > > > >> > >> > > > > Best, >> > >> > > > > Aitozi. >> > >> > > > > >> > >> > > > > Yang Wang <danrtsey...@gmail.com> 于2022年2月28日周一 16:28写道: >> > >> > > > > >> > >> > > > > > Thanks for posting the discussion here. >> > >> > > > > > >> > >> > > > > > >> > >> > > > > > Having the components `validator` `observer` `reconciler` >> > makes >> > >> lots >> > >> > > of >> > >> > > > > > sense. And the "Validate -> Observe -> Reconcile" >> > >> > > > > > flow seems natural to me. >> > >> > > > > > >> > >> > > > > > Regarding the implementation in the PR, instead of directly >> > >> using the >> > >> > > > > > observer in the reconciler, I lean to let the observer >> > >> > > > > > exports the results to the status(e.g. jobmanager >> deployment >> > >> status, >> > >> > > > rest >> > >> > > > > > port readiness, flink jobs status, etc.) and >> > >> > > > > > the reconciler reads it from the status. Then each >> component >> > is >> > >> more >> > >> > > > > > self-contained and the boundary will be clearer. >> > >> > > > > > >> > >> > > > > > >> > >> > > > > > Best, >> > >> > > > > > Yang >> > >> > > > > > >> > >> > > > > > Gyula Fóra <gyf...@apache.org> 于2022年2月28日周一 16:01写道: >> > >> > > > > > >> > >> > > > > > > Hi All! >> > >> > > > > > > >> > >> > > > > > > I would like to start a discussion thread regarding the >> > >> structure >> > >> > > of >> > >> > > > > > > the Kubernetes >> > >> > > > > > > Operator >> > >> > > > > > > < >> > >> > > > > > > >> > >> > > > > > >> > >> > > > > >> > >> > > > >> > >> > > >> > >> >> > >> https://github.com/apache/flink-kubernetes-operator/blob/main/flink-kubernetes-operator/src/main/java/org/apache/flink/kubernetes/operator/controller/FlinkDeploymentController.java >> > >> > > > > > > > >> > >> > > > > > > controller >> > >> > > > > > > flow. Based on some recent PR discussions we have no >> clear >> > >> > > consensus >> > >> > > > on >> > >> > > > > > the >> > >> > > > > > > structure and the expectations which can potentially >> lead to >> > >> back >> > >> > > and >> > >> > > > > > forth >> > >> > > > > > > changes and unnecessary complexity. >> > >> > > > > > > >> > >> > > > > > > *Background* >> > >> > > > > > > In the initial prototype we had a very basic flow: >> > >> > > > > > > 1. Observe flink job status >> > >> > > > > > > 2. (if observation successful) reconcile changes >> > >> > > > > > > 3. Reschedule reconcile with success/error >> > >> > > > > > > >> > >> > > > > > > This basic prototype flow could not cover all >> requirements >> > >> and did >> > >> > > > not >> > >> > > > > > > allow for things like waiting until Jobmanager >> deployment is >> > >> ready >> > >> > > > etc. >> > >> > > > > > > >> > >> > > > > > > To solve these shortcomings, some changes were introduced >> > >> recently >> > >> > > > here >> > >> > > > > > > < >> > https://github.com/apache/flink-kubernetes-operator/pull/21 >> > >> >. >> > >> > > While >> > >> > > > > > this >> > >> > > > > > > change introduced many improvements and safeguards it >> also >> > >> > > completely >> > >> > > > > > > changed the original controller flow. Now the reconciler >> is >> > >> > > > responsible >> > >> > > > > > for >> > >> > > > > > > ensuring that it can actually reconcile by checking the >> > >> deployment >> > >> > > > and >> > >> > > > > > > ports. The job status observation logic has also been >> moved >> > >> into >> > >> > > the >> > >> > > > > > actual >> > >> > > > > > > reconcile logic. >> > >> > > > > > > >> > >> > > > > > > >> > >> > > > > > > *Discussion Question*What controller flow would we like >> to >> > >> have? Do >> > >> > > > we >> > >> > > > > > want >> > >> > > > > > > to separate the observer from the reconciler or keep them >> > >> together? >> > >> > > > > > > >> > >> > > > > > > In my personal view, we should try to adopt a very simple >> > >> flow to >> > >> > > > make >> > >> > > > > > the >> > >> > > > > > > operator clean and modular. If possible I would like to >> > >> restore the >> > >> > > > > > > original flow with some modifications: >> > >> > > > > > > >> > >> > > > > > > 1. Validate deployment object >> > >> > > > > > > 2. Observe deployment and flink job status -> Return >> > >> comprehensive >> > >> > > > > > status >> > >> > > > > > > info >> > >> > > > > > > 3. Reconcile deployment based on observed status and >> > resource >> > >> > > > changes >> > >> > > > > > > (Both 2/3 should be able to reschedule immediately if >> > >> necessary) >> > >> > > > > > > >> > >> > > > > > > I think the Observer component should be able to describe >> > the >> > >> > > current >> > >> > > > > > > status of the deployment objects and the flink job to the >> > >> extent >> > >> > > that >> > >> > > > > the >> > >> > > > > > > reconciler can work with that information alone. If we >> do it >> > >> this >> > >> > > > way, >> > >> > > > > we >> > >> > > > > > > can also use the status information that the observer >> > >> provides to >> > >> > > > > produce >> > >> > > > > > > other events and aid operations like shutdown which >> depend >> > on >> > >> the >> > >> > > > > current >> > >> > > > > > > deployment status. >> > >> > > > > > > >> > >> > > > > > > I think this would satisfy our needs, but I might be >> missing >> > >> > > > something >> > >> > > > > > that >> > >> > > > > > > cannot be done if we structure the code this way. >> > >> > > > > > > >> > >> > > > > > > I have a PR open >> > >> > > > > > > < >> > >> > > >> https://github.com/apache/flink-kubernetes-operator/pull/26/commits >> > >> > > > > >> > >> > > > > > > which >> > >> > > > > > > includes some of these proposed changes (as the optional >> > >> second >> > >> > > > commit) >> > >> > > > > > so >> > >> > > > > > > that you can easily compare with the current state of the >> > >> operator. >> > >> > > > > > > >> > >> > > > > > > Please let us know what we think! >> > >> > > > > > > >> > >> > > > > > > Cheers, >> > >> > > > > > > Gyula >> > >> > > > > > > >> > >> > > > > > >> > >> > > > > >> > >> > > > >> > >> > > >> > >> >> > > >> > >> >