SameerMesiah97 opened a new issue, #61829:
URL: https://github.com/apache/airflow/issues/61829

   ### Apache Airflow Provider(s)
   
   google
   
   ### Versions of Apache Airflow Providers
   
   `apache-airflow-providers-google==20.0.0rc1`
   
   ### Apache Airflow version
   
   main
   
   ### Operating System
   
   Debian GNU/Linux 12 (bookworm)
   
   ### Deployment
   
   Other
   
   ### Deployment details
   
   _No response_
   
   ### What happened
   
   `ComputeEngineInsertInstanceOperator` currently treats the mere presence of 
a Compute Engine instance as success. When an instance already exists, the 
operator logs that it exists and immediately returns success without verifying 
whether the existing resource matches the requested configuration. As a result, 
configuration differences are not detected on subsequent DAG runs—particularly 
changes to critical fields such as `machine_type` or boot disk settings. If 
these values are modified in the DAG and the task is re-run, the operator 
neither recreates nor updates the instance; it simply succeeds silently. This 
leads to incorrect idempotence semantics and allows infrastructure drift to 
persist undetected across DAG executions.
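
   For illustration, the current control flow amounts to roughly the following (a paraphrased sketch, not the actual provider source; the hook method names are assumptions modeled on `ComputeEngineHook`):

   ```python
   # Paraphrased sketch of the presence-only check; not the provider source.
   # Assumes a ComputeEngineHook-like object exposing get_instance()/insert_instance().
   from google.api_core.exceptions import NotFound


   def insert_if_absent(hook, project_id: str, zone: str, body: dict, log) -> None:
       try:
           # If the instance can be fetched, the operator treats that as success...
           hook.get_instance(zone=zone, resource_id=body["name"], project_id=project_id)
           log.info("The %s Instance already exists", body["name"])
           # ...and returns without comparing the existing resource to `body`.
           return
       except NotFound:
           pass
       hook.insert_instance(zone=zone, body=body, project_id=project_id)
   ```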
   
   
   ### What you think should happen instead
   
   The operator should verify that the existing Compute Engine instance matches 
the configuration defined in the DAG rather than relying solely on 
presence-based idempotence. On subsequent DAG runs, it should detect 
differences in critical configuration fields—such as `machine_type` and disk 
parameters—and surface those differences explicitly. At minimum, configuration 
drift should be logged so that users are aware of the mismatch. Ideally, the 
operator should support reconciling the resource to the declared state, for 
example by recreating the instance when differences are detected and an 
explicit flag is set. This would ensure consistent, declarative behavior across 
DAG re-runs and prevent silent infrastructure drift.
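
   As a rough illustration, detecting drift could be as simple as comparing a few requested fields against the instance returned by the existing lookup. Below is a minimal sketch; the helper name and the fields compared are examples only, not a proposed final design:

   ```python
   # Minimal sketch of field-level drift logging; helper name and chosen fields are examples only.
   def log_config_drift(existing, requested_body: dict, log) -> bool:
       """Return True if the existing compute_v1 Instance differs from the requested body."""
       drift = False

       # machine_type is stored as a full URL on the existing resource, so compare by suffix:
       # "zones/<zone>/machineTypes/e2-medium" matches ".../zones/<zone>/machineTypes/e2-medium".
       requested_type = requested_body.get("machine_type")
       if requested_type and not str(existing.machine_type).endswith(requested_type):
           log.warning("machine_type drift: requested %s, found %s", requested_type, existing.machine_type)
           drift = True

       # A very coarse disk check; a real implementation would compare disk parameters field by field.
       requested_disks = requested_body.get("disks", [])
       if requested_disks and len(existing.disks) != len(requested_disks):
           log.warning("disk drift: requested %d disk(s), found %d", len(requested_disks), len(existing.disks))
           drift = True

       return drift
   ```

   Whether the operator should only log the result or also act on it is discussed under "Anything else" below.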
   
   ### How to reproduce
   
   1. Configure a Google Cloud connection in Airflow (e.g. 
`google_cloud_default`) with a service account that has permission to create 
and manage Compute Engine instances.
   2. Ensure you have a valid GCP project ID and zone (for example, 
`us-central1-a`).
   3. Create the following DAG:
   ```python
   from airflow import DAG
   from datetime import datetime
   from airflow.providers.google.cloud.operators.compute import (
       ComputeEngineInsertInstanceOperator,
   )
   
   PROJECT_ID = "<YOUR_PROJECT_ID>"
   ZONE = "us-central1-a"
   INSTANCE_NAME = "airflow-idempotence-test"
   
   with DAG(
       dag_id="gce_idempotence_repro",
       start_date=datetime(2025, 1, 1),
       schedule=None,
       catchup=False,
   ) as dag:
   
       create_instance = ComputeEngineInsertInstanceOperator(
           task_id="create_instance",
           project_id=PROJECT_ID,
           zone=ZONE,
           body={
               "name": INSTANCE_NAME,
               "machine_type": f"zones/{ZONE}/machineTypes/e2-medium",  # 
Initial machine type used in this repro
               "disks": [
                   {
                       "boot": True,
                       "auto_delete": True,
                       "initialize_params": {
                           "source_image": 
"projects/debian-cloud/global/images/family/debian-11"
                       },
                   }
               ],
               "network_interfaces": [
                   {
                       "network": "global/networks/default"
                   }
               ],
           },
       )
   ```
   4. Trigger the DAG once and confirm that the instance is created 
successfully.
   5. Update the machine type:
   
       `"machine_type": f"zones/{ZONE}/machineTypes/n2-standard-4",`
        (Any valid machine type different from the original may be used; in 
this repro, `n2-standard-4` is used.)
   
   6. Trigger the DAG again.
   
   **Observed Behavior**
   
   The task logs that the instance already exists and completes successfully. 
However, the instance configuration remains unchanged (e.g., the machine type 
stays `e2-medium`), and no configuration differences are detected or logged. 
This shows that changes to fields such as `machine_type` (and disk 
configuration) are not recognized on DAG re-run, and the operator does not 
reconcile the resource to the requested state.
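
   To confirm the drift independently of the task logs, the instance can be inspected with the `google-cloud-compute` client (assuming Application Default Credentials are configured; the project, zone, and instance name below are the ones from the repro DAG):

   ```python
   # Optional verification that the instance kept its original machine type.
   from google.cloud import compute_v1

   client = compute_v1.InstancesClient()
   instance = client.get(
       project="<YOUR_PROJECT_ID>",
       zone="us-central1-a",
       instance="airflow-idempotence-test",
   )
   # After the second DAG run this still ends in ".../machineTypes/e2-medium".
   print(instance.machine_type)
   ```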
   
   
   ### Anything else
   
   **This report is not proposing destructive behavior by default**. 
Automatically deleting and recreating instances when differences are detected 
may not be appropriate for all users or environments. However, at a minimum, 
configuration differences should be surfaced in logs. Any reconciliation 
behavior (such as recreation) should be explicitly opt-in, allowing users to 
choose stronger convergence semantics when desired without changing existing 
default behavior.
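
   To make the opt-in idea concrete, usage could look something like the sketch below. The parameter name is hypothetical and does not exist in the provider today; `INSTANCE_BODY` stands in for the same `body` dict used in the repro DAG:

   ```python
   # Hypothetical opt-in reconciliation flag; NOT part of the current provider API.
   create_instance = ComputeEngineInsertInstanceOperator(
       task_id="create_instance",
       project_id=PROJECT_ID,
       zone=ZONE,
       body=INSTANCE_BODY,
       # recreate_on_config_drift=True,  # hypothetical: delete and re-insert only when drift is detected
   )
   ```

   With the flag left unset (as above), behavior would be identical to today apart from the proposed drift logging.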
   
   
   ### Are you willing to submit PR?
   
   - [x] Yes I am willing to submit a PR!
   
   ### Code of Conduct
   
   - [x] I agree to follow this project's [Code of 
Conduct](https://github.com/apache/airflow/blob/main/CODE_OF_CONDUCT.md)
   

