gbloisi-openaire opened a new issue, #38017:
URL: https://github.com/apache/airflow/issues/38017

   ### Apache Airflow version
   
   2.8.2
   
   ### If "Other Airflow 2 version" selected, which one?
   
   2.8.3rc1
   
   ### What happened?
   
   I'm running a spark-pi example using the SparkKubernetesOperator:

   ```python
   submit = SparkKubernetesOperator(
       task_id='spark_pi_submit',
       namespace='lot1-spark-jobs',
       application_file="/example_spark_kubernetes_operator_pi.yaml",
       kubernetes_conn_id="kubernetes_default",
       do_xcom_push=True,
       in_cluster=True,
       delete_on_termination=True,
       dag=dag
   )
   ```
   
   It was running fine on 2.8.1. After upgrading to Airflow 2.8.2 I got the following error:
   ```
   [2024-03-10T10:29:15.607+0000] {taskinstance.py:2731} ERROR - Task failed with exception
   Traceback (most recent call last):
     File "/home/airflow/.local/lib/python3.11/site-packages/airflow/models/taskinstance.py", line 444, in _execute_task
       result = _execute_callable(context=context, **execute_callable_kwargs)
                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
     File "/home/airflow/.local/lib/python3.11/site-packages/airflow/models/taskinstance.py", line 414, in _execute_callable
       return execute_callable(context=context, **execute_callable_kwargs)
              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
     File "/home/airflow/.local/lib/python3.11/site-packages/airflow/providers/cncf/kubernetes/operators/spark_kubernetes.py", line 261, in execute
       kube_client=self.client,
                   ^^^^^^^^^^^
     File "/usr/local/lib/python3.11/functools.py", line 1001, in __get__
       val = self.func(instance)
             ^^^^^^^^^^^^^^^^^^^
     File "/home/airflow/.local/lib/python3.11/site-packages/airflow/providers/cncf/kubernetes/operators/spark_kubernetes.py", line 250, in client
       return self.hook.core_v1_client
              ^^^^^^^^^
     File "/usr/local/lib/python3.11/functools.py", line 1001, in __get__
       val = self.func(instance)
             ^^^^^^^^^^^^^^^^^^^
     File "/home/airflow/.local/lib/python3.11/site-packages/airflow/providers/cncf/kubernetes/operators/spark_kubernetes.py", line 242, in hook
       or self.template_body.get("kubernetes", {}).get("kube_config_file", None),
          ^^^^^^^^^^^^^^^^^^
     File "/home/airflow/.local/lib/python3.11/site-packages/airflow/providers/cncf/kubernetes/operators/spark_kubernetes.py", line 198, in template_body
       return self.manage_template_specs()
              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
     File "/home/airflow/.local/lib/python3.11/site-packages/airflow/providers/cncf/kubernetes/operators/spark_kubernetes.py", line 127, in manage_template_specs
       template_body = _load_body_to_dict(open(self.application_file))
                                          ^^^^^^^^^^^^^^^^^^^^^^^^^^^
   FileNotFoundError: [Errno 2] No such file or directory: 'apiVersion: "sparkoperator.k8s.io/v1beta2"\nkind: SparkApplication\nmetadata:\n  name: spark-pi\n  namespace: lot1-spark-jobs\ns
   [2024-03-10T10:29:15.613+0000] {taskinstance.py:1149} INFO - Marking task as UP_FOR_RETRY. dag_id=spark_pi, task_id=spark_pi_submit, execution_date=20240310T102910, start_date=20240310T
   ```
   It looks like `self.application_file` ends up holding the content of the file it points to, rather than the path.
   
   I suspect this was caused by changes introduced in [PR-22253](https://github.com/apache/airflow/pull/22253). I'm quite new to Airflow and Python, but my guess is that the `application_file` property should no longer be treated as a [templated field](https://github.com/apache/airflow/blob/93dfdf0d7fc7641b1ddc5cbc881d93893fd2774c/airflow/providers/cncf/kubernetes/operators/spark_kubernetes.py#L69), since the template handling was moved to `template_body`.
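   For illustration, here is a minimal, self-contained sketch of the interaction I suspect (simplified stand-in code, not the provider's actual implementation): because `application_file` is a templated field and ends in a `template_ext` extension, template rendering seems to replace the path with the rendered file content, and the later `open()` call in `manage_template_specs` then receives YAML text instead of a path.

   ```python
   import tempfile

   # A manifest like the spark-pi example above.
   yaml_manifest = 'apiVersion: "sparkoperator.k8s.io/v1beta2"\nkind: SparkApplication\n'

   # 1. application_file starts out as a path to a real file.
   with tempfile.NamedTemporaryFile("w", suffix=".yaml", delete=False) as f:
       f.write(yaml_manifest)
       application_file = f.name

   # 2. Template rendering (field in template_fields, suffix in template_ext)
   #    replaces the path with the file's rendered content.
   with open(application_file) as f:
       application_file = f.read()

   # 3. manage_template_specs() then calls open() on the attribute again, but it
   #    now holds YAML text, so open() fails with the manifest showing up as the
   #    "missing file" name, matching the FileNotFoundError in the log above.
   try:
       open(application_file)
   except FileNotFoundError as err:
       print(err)  # [Errno 2] No such file or directory: 'apiVersion: ...'
   ```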
   
   
   
   ### What you think should happen instead?
   
   _No response_
   
   ### How to reproduce
   
   Given my understanding of the issue, any simple use of the SparkKubernetesOperator with the application_file property should reproduce it, for example the sketch below.
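
   A minimal DAG along these lines should be enough (the dag_id, schedule and manifest path here are placeholders I picked; the operator arguments match the task above):

   ```python
   from datetime import datetime

   from airflow import DAG
   from airflow.providers.cncf.kubernetes.operators.spark_kubernetes import (
       SparkKubernetesOperator,
   )

   with DAG(
       dag_id="spark_pi",
       start_date=datetime(2024, 3, 1),
       schedule=None,
       catchup=False,
   ) as dag:
       # Any SparkApplication manifest reachable from the worker should trigger
       # the same FileNotFoundError after the upgrade described above.
       submit = SparkKubernetesOperator(
           task_id="spark_pi_submit",
           namespace="lot1-spark-jobs",
           application_file="/example_spark_kubernetes_operator_pi.yaml",
           kubernetes_conn_id="kubernetes_default",
           do_xcom_push=True,
           in_cluster=True,
           delete_on_termination=True,
       )
   ```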
   
   ### Operating System
   
   kind kubernetes
   
   ### Versions of Apache Airflow Providers
   
   _No response_
   
   ### Deployment
   
   Official Apache Airflow Helm Chart
   
   ### Deployment details
   
   _No response_
   
   ### Anything else?
   
   _No response_
   
   ### Are you willing to submit PR?
   
   - [ ] Yes I am willing to submit a PR!
   
   ### Code of Conduct
   
   - [X] I agree to follow this project's [Code of 
Conduct](https://github.com/apache/airflow/blob/main/CODE_OF_CONDUCT.md)
   

