kevinalh opened a new issue, #38006:
URL: https://github.com/apache/airflow/issues/38006

   ### Apache Airflow version
   
   2.8.2
   
   ### If "Other Airflow 2 version" selected, which one?
   
   _No response_
   
   ### What happened?
   
   While upgrading from Airflow 2.2.2 to 2.8.1, I encountered the error `ValueError: non-default argument follows default argument` on my TaskFlow-decorated operators. I had to read through the Airflow code to understand what was going on.
   
   The general shape of my tasks is:
   
   ```python
   @task(multiple_outputs=True)
   def decorated_task(
       inputs: list[dict],
       logical_date: str,
       project_id: str,
       bucket_name: str,
   ) -> dict[str, str]:
       # Implementation details
       return {}
   ```
   
   called from a DAG like:
   
   ```python
   @dag(
       dag_id=dag_id,
       # ...
       default_args={
           "depends_on_past": False,
           "retries": 3,
           "retry_delay": timedelta(hours=1),
       },
   )
   def decorated_dag():
       # ...
       task_date = "{{ (dag_run.logical_date - macros.timedelta(days=1)) | ds_nodash }}"
       # ...
       decorated_task_result = decorated_task(
           inputs=inputs,
           logical_date=task_date,
           project_id=PROJECT_ID,
           bucket_name=BUCKET_NAME,
       )
   ```
   
   The error comes with a stack trace like the following:
   
   ```
     File "/usr/local/airflow/.local/lib/python3.11/site-packages/airflow/models/dag.py", line 3944, in factory
       f(**f_kwargs)
     File "/usr/local/airflow/dags/dag_file.py", line 160, in decorated_dag
       decorated_task_result = decorated_task(
                               ^^^^^^^^^^^^^^
     File "/usr/local/airflow/.local/lib/python3.11/site-packages/airflow/decorators/base.py", line 366, in __call__
       op = self.operator_class(
            ^^^^^^^^^^^^^^^^^^^^
     File "/usr/local/airflow/.local/lib/python3.11/site-packages/airflow/models/baseoperator.py", line 437, in apply_defaults
       result = func(self, **kwargs, default_args=default_args)
                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
     File "/usr/local/airflow/.local/lib/python3.11/site-packages/airflow/decorators/python.py", line 52, in __init__
       super().__init__(
     File "/usr/local/airflow/.local/lib/python3.11/site-packages/airflow/models/baseoperator.py", line 437, in apply_defaults
       result = func(self, **kwargs, default_args=default_args)
                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
     File "/usr/local/airflow/.local/lib/python3.11/site-packages/airflow/decorators/base.py", line 217, in __init__
       signature = signature.replace(parameters=parameters)
                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
     File "/usr/local/lib/python3.11/inspect.py", line 3052, in replace
       return type(self)(parameters,
              ^^^^^^^^^^^^^^^^^^^^^^
     File "/usr/local/lib/python3.11/inspect.py", line 3008, in __init__
       raise ValueError(msg)
   ValueError: non-default argument follows default argument
   ```
   
   The problem also happens with 2.8.2.
   
   As the trace mentions, the problem is in this part of the code of the 
`DecoratedOperator` class in `decorators/base.py`:
   
   ```python
   # Check the decorated function's signature. We go through the argument
   # list and "fill in" defaults to arguments that are known context keys,
   # since values for those will be provided when the task is run. Since
   # we're not actually running the function, None is good enough here.
   signature = inspect.signature(python_callable)
   parameters = [
       param.replace(default=None) if param.name in KNOWN_CONTEXT_KEYS else param
       for param in signature.parameters.values()
   ]
   signature = signature.replace(parameters=parameters)
   ```
   
   Since `logical_date` is the name of a known context key, this code gives it a `None` default. When the parameters that follow it have no default, the rebuilt signature is invalid and `inspect.Signature` raises the error.
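
The failure can be reproduced without Airflow. The sketch below mirrors the quoted snippet using only the standard library; `KNOWN_CONTEXT_KEYS` here is a small stand-in set and `fill_context_defaults` is a hypothetical helper name, neither taken from Airflow itself.

```python
import inspect

# Assumption: an illustrative subset standing in for Airflow's KNOWN_CONTEXT_KEYS.
KNOWN_CONTEXT_KEYS = {"logical_date", "ds", "ti"}


def fill_context_defaults(func):
    """Hypothetical helper: give context-key parameters a default of None."""
    signature = inspect.signature(func)
    parameters = [
        param.replace(default=None) if param.name in KNOWN_CONTEXT_KEYS else param
        for param in signature.parameters.values()
    ]
    # Signature re-validates parameter order here and raises ValueError when
    # a parameter without a default follows one that has a default.
    return signature.replace(parameters=parameters)


def context_key_first(logical_date: str, input: str) -> str:
    return f"{logical_date}, {input}"


def context_key_last(input: str, logical_date: str) -> str:
    return f"{logical_date}, {input}"


# The context key comes first, so `input` ends up after a defaulted parameter.
try:
    fill_context_defaults(context_key_first)
except ValueError as err:
    error_message = str(err)
else:
    error_message = None

# With the context key last, the rebuilt signature is valid.
ok_signature = fill_context_defaults(context_key_last)
```

Only the ordering differs between the two functions, which matches the observation that the position of `logical_date` decides whether the error appears.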
   
   ### What you think should happen instead?
   
   Two options come to mind:
   
   1. Make the error message clearer, stating explicitly that parameters named after context variables must come _after_ all other parameters in TaskFlow function signatures. This should also be covered in the documentation.
   2. Allow this ordering by changing the code, for example by moving the known context parameters that receive the `None` default to the end of the signature.
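
Both options can be sketched outside Airflow. This is only an illustration under stated assumptions, not the project's implementation: `KNOWN_CONTEXT_KEYS` is a stand-in subset, the function names are hypothetical, and the sketch assumes the task function has no positional-only parameters.

```python
import inspect

# Assumption: an illustrative subset standing in for Airflow's KNOWN_CONTEXT_KEYS.
KNOWN_CONTEXT_KEYS = {"logical_date", "ds", "ti"}


def _with_context_defaults(signature):
    """Return the parameters with context keys defaulted to None."""
    return [
        p.replace(default=None) if p.name in KNOWN_CONTEXT_KEYS else p
        for p in signature.parameters.values()
    ]


def signature_with_clear_error(func):
    """Option 1 (hypothetical): same behavior, but a more helpful message."""
    signature = inspect.signature(func)
    parameters = _with_context_defaults(signature)
    try:
        return signature.replace(parameters=parameters)
    except ValueError as err:
        context_named = [p.name for p in parameters if p.name in KNOWN_CONTEXT_KEYS]
        raise ValueError(
            f"Parameters {context_named} of {func.__name__!r} are named after "
            "context keys and receive a default of None; move them after all "
            "parameters without defaults, or rename them."
        ) from err


def signature_with_reordering(func):
    """Option 2 (hypothetical): move defaulted parameters last within each kind."""
    signature = inspect.signature(func)
    # sorted() is stable, so the relative order of the other parameters is kept.
    parameters = sorted(
        _with_context_defaults(signature),
        key=lambda p: (p.kind, p.default is not inspect.Parameter.empty),
    )
    return signature.replace(parameters=parameters)


def run_task(logical_date: str, input: str) -> str:
    return f"{logical_date}, {input}"


reordered = signature_with_reordering(run_task)
```

One caveat with the reordering approach: the rebuilt signature no longer matches the positional order of the function, so binding positional arguments against it could pair values with the wrong names. It looks safest when tasks are called with keyword arguments only.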
   
   ### How to reproduce
   
   Doing a brand-new local installation of Airflow with
   
   ```
   mkdir test_airflow
   cd test_airflow
   python -m venv ./.venv
   source ./.venv/bin/activate.fish
   
   pip install "apache-airflow==2.8.2" --constraint "https://raw.githubusercontent.com/apache/airflow/constraints-2.8.2/constraints-3.11.txt"
   ```
   
   and adding a simple DAG:
   
   ```python
   from airflow.decorators import task, dag
   
   
   @task
   def run_task(logical_date: str, input: str) -> str:
       return f"{logical_date}, {input}"
   
   
   @dag(tags=["test"])
   def run_dag():
       run_task(input="test_input", logical_date="{{ logical_date }}")
   
   
   run_dag()
   ```
   
   reproduces the issue. Putting `logical_date` as the second argument of 
`run_task` fixes it, as expected.
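
For anyone checking existing task functions before an upgrade, a small standard-library helper along these lines could flag the problematic ordering. It is a sketch: `misplaced_context_params` is a hypothetical name, and `KNOWN_CONTEXT_KEYS` is a stand-in subset rather than Airflow's full set.

```python
import inspect

# Assumption: an illustrative subset standing in for Airflow's KNOWN_CONTEXT_KEYS.
KNOWN_CONTEXT_KEYS = {"logical_date", "ds", "ti"}

_POSITIONAL_KINDS = (
    inspect.Parameter.POSITIONAL_ONLY,
    inspect.Parameter.POSITIONAL_OR_KEYWORD,
)


def misplaced_context_params(func):
    """Return context-key-named parameters followed by a positional
    parameter that has no default and is not itself a context key."""
    positional = [
        p
        for p in inspect.signature(func).parameters.values()
        if p.kind in _POSITIONAL_KINDS
    ]
    misplaced = []
    for index, param in enumerate(positional):
        if param.name not in KNOWN_CONTEXT_KEYS:
            continue
        followers = positional[index + 1:]
        if any(
            f.default is inspect.Parameter.empty
            and f.name not in KNOWN_CONTEXT_KEYS
            for f in followers
        ):
            misplaced.append(param.name)
    return misplaced


def failing_task(logical_date: str, input: str) -> str:
    return f"{logical_date}, {input}"


def working_task(input: str, logical_date: str) -> str:
    return f"{logical_date}, {input}"


failing_report = misplaced_context_params(failing_task)
working_report = misplaced_context_params(working_task)
```

An empty list means the function should pass the signature check described above; any names returned are candidates for moving to the end or renaming.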
   
   ### Operating System
   
   macOS Sonoma 14.3.1
   
   ### Versions of Apache Airflow Providers
   
   Only the default ones, this isn't relevant.
   
   ### Deployment
   
   Virtualenv installation
   
   ### Deployment details
   
   Was able to reproduce it with a minimal local virtualenv installation, but 
originally found the issue in the MWAA local runner.
   
   ### Anything else?
   
   After figuring it out I posted about it on StackOverflow in case someone 
else encounters the problem: 
https://stackoverflow.com/questions/78131051/valueerror-non-default-argument-follows-default-argument-when-upgrading-from
   
   ### Are you willing to submit PR?
   
   - [X] Yes I am willing to submit a PR!
   
   ### Code of Conduct
   
   - [X] I agree to follow this project's [Code of Conduct](https://github.com/apache/airflow/blob/main/CODE_OF_CONDUCT.md)
   

