And just one more thing I forgot to add: The observer does not have to be a dummy monolithic logic. Different job types will have different state machines with different observe requirements and different observer implementations. I think this is complexly normal. The observer can be aware of the expected state and what it should observe on the happy path.
Gyula On Tue, 1 Mar 2022 at 06:42, Gyula Fóra <[email protected]> wrote: > @Thomas: > > Thanks for the input! I completely agree with a well principled state > machine based approach in the operator would be the best. > > You are right that the code contains mostly if then else based logic at > the moment as it evolved after initial prototype to cover more scenarios. > > However I think the self contained observer - reconciler approach is very > capable of implementing the suggested state machine based view in a very > elegant way. > > Simply put, the observer is responsible for determining and recording the > current state and any extra information that comes with that state. > > The reconciler then look at the current vs desired state and execute a > state transition. > > Gyula > > On Tue, 1 Mar 2022 at 02:16, Thomas Weise <[email protected]> wrote: > >> Thanks for bringing the discussion. It's a good time to revisit this >> area as the operator implementation becomes more complex. >> >> I think the most important part of the design are well defined states >> and transitions. Unless that can be reasoned about, there will be >> confusion. For example: >> >> 1) For job upgrade, it wasn't clear in which state reconciliation >> should happen and as a result the implementation ended up incomplete. >> 2) "JobStatusObserver" would attempt to list jobs before the REST API >> is ready. And it would be used in session mode, even though the >> session mode needs a different state machine. >> >> The implementation currently contains a lot of if-then-else logic, >> which is hard to follow and difficult to maintain. It will make it >> harder to introduce new states that would be necessary to implement >> more advanced upgrade strategies, amongst others. >> >> Did you consider a state machine centric approach? See [1] for example. >> >> As Biao mentioned, observe -> reconcile may repeat and different >> states will require different checks. Once a job (in job mode) is >> running, there is no need to check the JM deployment etc. A monolithic >> observer may not work so well for that. Rather, I think different >> states have different monitoring needs that inform transitions, >> including the actual state and changes made to the CR. >> >> It would also be good if the current state is directly reflected in >> the CR status so that it is easier to check where the deployment is >> at. >> >> Cheers, >> Thomas >> >> >> [1] >> https://github.com/lyft/flinkk8soperator/blob/master/docs/state_machine.md >> >> On Mon, Feb 28, 2022 at 8:16 AM Gyula Fóra <[email protected]> wrote: >> > >> > Hi All! >> > >> > Thank you for the feedback, I agree with what has been proposed to >> include >> > as much as possible in the actual resource status and make the >> reconciler >> > completely independent from the observer. >> > >> > @Biao Geng: >> > Validation could depend on the current status (depending on how we >> > implement the validation logic) so it might always be necessary (and it >> is >> > also cheap). >> > What you are saying with multiple observe -> reconcile cycles ties back >> to >> > what Matyas said, that we should probably have an Observe loop until we >> > have a stable state ready for reconciliation, then reconcile once. >> > >> > So probably validate -> observe until stable -> reconcile -> observe >> until >> > stable >> > >> > Cheers, >> > Gyula >> > >> > On Mon, Feb 28, 2022 at 4:49 PM Biao Geng <[email protected]> wrote: >> > >> > > Hi Gyula, >> > > Thanks for the discussion. It also makes senses to me on the >> separation of >> > > 3 components and Yang's proposal. >> > > Just 1 follow-up thought after checking your PR: in the reconcile() >> > > method of controller, IIUC, the real flow could be >> > > `validate->observe->reconcile->validate->observe->reconcile...". The >> > > validation phase seems to be required only when the creation of the >> job >> > > cluster and the upgrade of config. For phases like waiting the JM from >> > > deploying to ready, it is not mandatory and thus the flow can look >> like >> > > `validate->observe->reconcile->optional validate due to current >> > > state->observe->reconcile...` >> > > >> > > Őrhidi Mátyás <[email protected]> 于2022年2月28日周一 21:26写道: >> > > >> > > > It is worth looking at the controller code in the spotify operator >> too: >> > > > >> > > > >> > > >> https://github.com/spotify/flink-on-k8s-operator/blob/master/controllers/flinkcluster/flinkcluster_controller.go >> > > > >> > > > It is looping in the 'observer phase' until it reaches a stable >> state, >> > > then >> > > > it performs the necessary changes. >> > > > >> > > > Based on this I also suggest keeping the logic in separate >> > > > modules(Validate->Observe->Reconcile). The three components might >> not >> > > > even be enough as we add more and more complexity to the code. >> > > > >> > > > Cheers, >> > > > Matyas >> > > > >> > > > >> > > > On Mon, Feb 28, 2022 at 2:03 PM Aitozi <[email protected]> >> wrote: >> > > > >> > > > > Hi, Gyula >> > > > > Thanks for driving this discussion. I second Yang Wang's >> idea >> > > that >> > > > > it's better to make the `validator`, `observer` and `reconciler` >> > > > > self-contained. I also prefer to define the `Observer` as an >> interface >> > > > and >> > > > > we could define the statuses that `Observer` will expose. It acts >> like >> > > > the >> > > > > observer protocol between the `Observer` and `Reconciler`. >> > > > > >> > > > > Best, >> > > > > Aitozi. >> > > > > >> > > > > Yang Wang <[email protected]> 于2022年2月28日周一 16:28写道: >> > > > > >> > > > > > Thanks for posting the discussion here. >> > > > > > >> > > > > > >> > > > > > Having the components `validator` `observer` `reconciler` makes >> lots >> > > of >> > > > > > sense. And the "Validate -> Observe -> Reconcile" >> > > > > > flow seems natural to me. >> > > > > > >> > > > > > Regarding the implementation in the PR, instead of directly >> using the >> > > > > > observer in the reconciler, I lean to let the observer >> > > > > > exports the results to the status(e.g. jobmanager deployment >> status, >> > > > rest >> > > > > > port readiness, flink jobs status, etc.) and >> > > > > > the reconciler reads it from the status. Then each component is >> more >> > > > > > self-contained and the boundary will be clearer. >> > > > > > >> > > > > > >> > > > > > Best, >> > > > > > Yang >> > > > > > >> > > > > > Gyula Fóra <[email protected]> 于2022年2月28日周一 16:01写道: >> > > > > > >> > > > > > > Hi All! >> > > > > > > >> > > > > > > I would like to start a discussion thread regarding the >> structure >> > > of >> > > > > > > the Kubernetes >> > > > > > > Operator >> > > > > > > < >> > > > > > > >> > > > > > >> > > > > >> > > > >> > > >> https://github.com/apache/flink-kubernetes-operator/blob/main/flink-kubernetes-operator/src/main/java/org/apache/flink/kubernetes/operator/controller/FlinkDeploymentController.java >> > > > > > > > >> > > > > > > controller >> > > > > > > flow. Based on some recent PR discussions we have no clear >> > > consensus >> > > > on >> > > > > > the >> > > > > > > structure and the expectations which can potentially lead to >> back >> > > and >> > > > > > forth >> > > > > > > changes and unnecessary complexity. >> > > > > > > >> > > > > > > *Background* >> > > > > > > In the initial prototype we had a very basic flow: >> > > > > > > 1. Observe flink job status >> > > > > > > 2. (if observation successful) reconcile changes >> > > > > > > 3. Reschedule reconcile with success/error >> > > > > > > >> > > > > > > This basic prototype flow could not cover all requirements >> and did >> > > > not >> > > > > > > allow for things like waiting until Jobmanager deployment is >> ready >> > > > etc. >> > > > > > > >> > > > > > > To solve these shortcomings, some changes were introduced >> recently >> > > > here >> > > > > > > <https://github.com/apache/flink-kubernetes-operator/pull/21 >> >. >> > > While >> > > > > > this >> > > > > > > change introduced many improvements and safeguards it also >> > > completely >> > > > > > > changed the original controller flow. Now the reconciler is >> > > > responsible >> > > > > > for >> > > > > > > ensuring that it can actually reconcile by checking the >> deployment >> > > > and >> > > > > > > ports. The job status observation logic has also been moved >> into >> > > the >> > > > > > actual >> > > > > > > reconcile logic. >> > > > > > > >> > > > > > > >> > > > > > > *Discussion Question*What controller flow would we like to >> have? Do >> > > > we >> > > > > > want >> > > > > > > to separate the observer from the reconciler or keep them >> together? >> > > > > > > >> > > > > > > In my personal view, we should try to adopt a very simple >> flow to >> > > > make >> > > > > > the >> > > > > > > operator clean and modular. If possible I would like to >> restore the >> > > > > > > original flow with some modifications: >> > > > > > > >> > > > > > > 1. Validate deployment object >> > > > > > > 2. Observe deployment and flink job status -> Return >> comprehensive >> > > > > > status >> > > > > > > info >> > > > > > > 3. Reconcile deployment based on observed status and resource >> > > > changes >> > > > > > > (Both 2/3 should be able to reschedule immediately if >> necessary) >> > > > > > > >> > > > > > > I think the Observer component should be able to describe the >> > > current >> > > > > > > status of the deployment objects and the flink job to the >> extent >> > > that >> > > > > the >> > > > > > > reconciler can work with that information alone. If we do it >> this >> > > > way, >> > > > > we >> > > > > > > can also use the status information that the observer >> provides to >> > > > > produce >> > > > > > > other events and aid operations like shutdown which depend on >> the >> > > > > current >> > > > > > > deployment status. >> > > > > > > >> > > > > > > I think this would satisfy our needs, but I might be missing >> > > > something >> > > > > > that >> > > > > > > cannot be done if we structure the code this way. >> > > > > > > >> > > > > > > I have a PR open >> > > > > > > < >> > > https://github.com/apache/flink-kubernetes-operator/pull/26/commits >> > > > > >> > > > > > > which >> > > > > > > includes some of these proposed changes (as the optional >> second >> > > > commit) >> > > > > > so >> > > > > > > that you can easily compare with the current state of the >> operator. >> > > > > > > >> > > > > > > Please let us know what we think! >> > > > > > > >> > > > > > > Cheers, >> > > > > > > Gyula >> > > > > > > >> > > > > > >> > > > > >> > > > >> > > >> >
