jbbqqf commented on issue #17089: URL: https://github.com/apache/airflow/issues/17089#issuecomment-4515863250
Hi! Triaging older issues: I think this one may be answerable with the existing API, and could potentially be closed.

The use case described (loading a YAML-exported Dataproc cluster config and feeding it to `DataprocCreateClusterOperator`) is already supported via the `cluster_config` parameter, which accepts a `dict` shaped like the `google.cloud.dataproc_v1.types.ClusterConfig` protobuf message:

- [`providers/google/src/airflow/providers/google/cloud/operators/dataproc.py#L602-L605`](https://github.com/apache/airflow/blob/main/providers/google/src/airflow/providers/google/cloud/operators/dataproc.py#L602-L605) (docstring): *"If a dict is provided, it must be of the same form as the protobuf message ClusterConfig"*.
- [`providers/google/src/airflow/providers/google/cloud/operators/dataproc.py#L656`](https://github.com/apache/airflow/blob/main/providers/google/src/airflow/providers/google/cloud/operators/dataproc.py#L656) (signature): `cluster_config: dict | Cluster | None = None`.
- The original reporter [acknowledged this back in 2021](https://github.com/apache/airflow/issues/17089#issuecomment-883091195) (*"Got it, I got confused with the Operator name earlier"*) after @mik-laj pointed out `cluster_config` was the recommended path.

In practice this means a `gcloud dataproc clusters export ... > cfg.yaml` output can be loaded with `yaml.safe_load(...)` in the DAG file and passed directly to `cluster_config=...`, so no dedicated YAML-loading operator is needed.

If I'm missing something (a YAML-specific feature beyond what `dict`-via-`yaml.safe_load` covers, or a deprecation I haven't spotted), happy to dig further. Otherwise this looks closeable.

(Disclosure: I drafted this comment with help from Claude Code while triaging stale issues; the references above were verified manually against the current `main` branch.)

--
This is an automated message from the Apache Git Service.
