tardunge commented on PR #63938: URL: https://github.com/apache/airflow/pull/63938#issuecomment-4100284049
> I can definitely see how an operator like this can be useful, especially with Airflow increasingly being used to orchestrate heavy ML workloads. But a few concerns with the current approach:
>
> 1. This duplicates quite a bit of lifecycle logic from `KubernetesJobOperator`, unnecessarily. I get that some duplication is unavoidable here since this is a CRD (different API + status model vs Jobs), but this feels like it goes further and reimplements the full lifecycle (polling, retry, terminal state, cleanup) that we usually centralize in the Kubernetes operators.
> 2. It bypasses a lot of the existing CRD methods in `KubernetesHook` by introducing bespoke logic in `ray_object_launcher.py` (which is the wrong layer, by the way), when `KubernetesHook` already provides `create/get/delete_custom_object` with retry and connection handling.
> 3. It doesn't support deferrable mode (which is CRITICAL for long-running Ray jobs). I understand this can be added later in a follow-up PR, but the lifecycle here is built around a blocking polling loop with tightly coupled retry/state logic. I don't think deferrable mode can just be tacked on without reworking the execution path.
>
> At a high level, this feels like it's diverging from existing Kubernetes provider patterns. I would suggest:
>
> * Keep the operator thin (submit + optionally wait)
> * Use `KubernetesHook` directly where possible or, if needed, move the CRD interaction into a (small) hook on top of it
> * Remove `ray_object_launcher.py`
> * Move shared concepts like statuses into utils (similar to `utils/pod_manager.py`)

- Using the k8s hook now, and added a `list_custom_object` stub to the hook.
- The operator is deferrable now.
- Removed `ray_object_launcher.py`, since we now use the k8s hook stubs, and moved the Ray lifecycle logic into `utils/`.
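For illustration, the suggested split between a thin operator and shared status utilities could look roughly like this. This is a minimal stdlib-only sketch, not the actual provider code: `RayJobStatus`, `is_terminal`, and the injected `get_custom_object` callable are hypothetical names standing in for the real hook/utils API.

```python
import time
from enum import Enum
from typing import Callable, Optional

class RayJobStatus(str, Enum):
    # Hypothetical status values mirroring a RayJob CRD's .status.jobStatus field.
    PENDING = "PENDING"
    RUNNING = "RUNNING"
    SUCCEEDED = "SUCCEEDED"
    FAILED = "FAILED"
    STOPPED = "STOPPED"

# Terminal states live in utils (akin to utils/pod_manager.py), so that the
# blocking path and a deferrable trigger share one definition of "done".
TERMINAL_STATES = {RayJobStatus.SUCCEEDED, RayJobStatus.FAILED, RayJobStatus.STOPPED}

def is_terminal(status: RayJobStatus) -> bool:
    """Shared predicate used by both the polling loop and a trigger."""
    return status in TERMINAL_STATES

def wait_for_job(
    get_custom_object: Callable[[], dict],
    poll_interval: float = 5.0,
    sleep: Optional[Callable[[float], None]] = None,
) -> RayJobStatus:
    """Blocking wait: poll the CRD (via the injected hook call) until terminal.

    In deferrable mode the operator would submit, defer to a trigger that runs
    this same check asynchronously, and resume in execute_complete().
    """
    sleep = sleep or time.sleep
    while True:
        raw = get_custom_object().get("status", {}).get("jobStatus", "PENDING")
        status = RayJobStatus(raw)
        if is_terminal(status):
            return status
        sleep(poll_interval)
```

The key design point is that the operator only injects "how to fetch the object" (the hook's `get_custom_object`) and the utils module owns the status model, so neither path duplicates lifecycle logic.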
