rajrao opened a new issue, #38611:
URL: https://github.com/apache/airflow/issues/38611

   ### Apache Airflow version
   
   Other Airflow 2 version (please specify below)
   
   ### If "Other Airflow 2 version" selected, which one?
   
   2.6.3
   
   ### What happened?
   
   After upgrading to MWAA 2.6.3, we have found that some of the logs we 
used to get previously have suddenly stopped showing up. We believe we have 
isolated it to calling SQL hook methods that execute a query, which causes 
the corresponding task logs to stop showing up.
   
   These are the logs we expect:
   ```
   ....
   [2024-03-28, 15:58:10 MDT] {{standard_task_runner.py:84}} INFO - Running: ['airflow', 'tasks', 'run', 'test_pg_query_dag', 'run_pg_query', 'manual__2024-03-28T21:58:05.024721+00:00', '--job-id', '6187557', '--raw', '--subdir', 'DAGS_FOLDER/airflow_test_pg_query_dag.py', '--cfg-path', '/tmp/tmpmu1hue6m']
   [2024-03-28, 15:58:10 MDT] {{standard_task_runner.py:85}} INFO - Job 6187557: Subtask run_pg_query
   [2024-03-28, 15:58:11 MDT] {{taskinstance.py:1545}} INFO - Exporting env vars: AIRFLOW_CTX_DAG_OWNER='airflow' AIRFLOW_CTX_DAG_ID='test_pg_query_dag' AIRFLOW_CTX_TASK_ID='run_pg_query' AIRFLOW_CTX_EXECUTION_DATE='2024-03-28T21:58:05.024721+00:00' AIRFLOW_CTX_TRY_NUMBER='1' AIRFLOW_CTX_DAG_RUN_ID='manual__2024-03-28T21:58:05.024721+00:00'
   [2024-03-28, 15:58:11 MDT] {{logging_mixin.py:150}} INFO - start run_pg_query
   [2024-03-28, 15:58:11 MDT] {{sql.py:374}} INFO - Running statement: SELECT now(), parameters: None
   [2024-03-28, 15:58:11 MDT] {{sql.py:383}} INFO - Rows affected: 1
   [2024-03-28, 15:58:11 MDT] {{logging_mixin.py:150}} INFO - data returned: 2024-03-28 21:58:11.648106+00:00
   [2024-03-28, 15:58:11 MDT] {{logging_mixin.py:150}} INFO - end run_pg_query
   [2024-03-28, 15:58:11 MDT] {{python.py:183}} INFO - Done. Returned value was: None
   ```
   The important pieces are:
   **[2024-03-28, 15:58:11 MDT] {{sql.py:374}} INFO - Running statement: SELECT now(), parameters: None**  
   **[2024-03-28, 15:58:11 MDT] {{sql.py:383}} INFO - Rows affected: 1**
   
   These 2 lines disappear if we call `pg_hook.run` or `pg_hook.get_records` 
from top-level DAG code. In the DAG below, uncommenting this line reproduces 
the problem:
   ```
   #rows_d = pg_hook.get_records(query)
   ```
   
   ```python
   from datetime import datetime
   
   from airflow import DAG
   from airflow.operators.python_operator import PythonOperator
   #from airflow.hooks import PostgresHook
   from airflow.hooks.postgres_hook import PostgresHook
   
   import logging
   
   _logger = logging.getLogger(__name__)
   _logger.setLevel(logging.INFO)
   
   pg_hook = PostgresHook(postgres_conn_id='redshift_conn') #, log_sql=True
   #pg_hook_task = PostgresHook(postgres_conn_id='redshift_conn')
   query = "SELECT now()";
   # the following line causes SQL logs to no longer display!
   #rows_d = pg_hook.get_records(query)
   
   def run_pg_query(**kwargs):
       print("start run_pg_query")
       query = "SELECT now()";
       pg_hook_task = PostgresHook(postgres_conn_id='redshift_conn')  #, log_sql=True
       rows = pg_hook_task.get_records(query)
       print(f"data returned: {rows[0][0]}")
       print("end run_pg_query")        
   
   with DAG(
       dag_id='test_pg_query_dag',
       start_date=datetime(2024, 1, 1),
       schedule_interval=None,
       default_args={
           'depends_on_past': False,
           'start_date': datetime(2024, 1, 1),
           'retries': 0
       },
       max_active_runs=1,
       catchup=False,
   ) as dag:
       run_pg_query_task = PythonOperator(
           task_id='run_pg_query',
           python_callable=run_pg_query,
           provide_context=True,
           dag=dag)
   ```
   
   This used to work correctly in the previous version of MWAA we were on; 
the problem seems to have started after we upgraded to 2.6.3.
   
   **The postgresql provider is: apache-airflow-providers-postgres 5.5.1.** 
   
   Is there a quick fix for this issue? Can I enable the logging within this 
DAG? I have tried setting `log_sql=True` when creating the hook, but that did 
not help.
   
   
   
   
   ### What you think should happen instead?
   
   The SQL logs should still be captured even when the SQL is executed from 
the DAG's top-level code.
   
   
   ### How to reproduce
   
   DAG code to reproduce this issue has been provided above.
   Uncomment the line **rows_d = pg_hook.get_records(query)** to reproduce 
the behavior.
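   The symptom (hook log lines vanishing while `print` output survives) is consistent with the hook's logger losing its route to the task log handler. A minimal stdlib illustration of that failure mode — this is not Airflow code, and the `demo` logger names are made up:

```python
import io
import logging

# Parent logger with a handler, standing in (loosely) for how task logs
# are collected from child loggers via propagation.
buf = io.StringIO()
parent = logging.getLogger("demo")
parent.addHandler(logging.StreamHandler(buf))
parent.setLevel(logging.INFO)

child = logging.getLogger("demo.sql_hook")
child.info("Running statement: SELECT now()")  # reaches the parent's handler

child.propagate = False                        # the kind of side effect a
child.info("Rows affected: 1")                 # parse-time call could cause:
                                               # this record is silently lost
print(buf.getvalue())
```

   Only the first message appears in the captured output; the second is dropped without any error, which matches how the `sql.py` lines disappear here with no other change in behavior.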
   
   ### Operating System
   
   AWS MWAA
   
   ### Versions of Apache Airflow Providers
   
   apache-airflow-providers-postgres    5.5.1  
   apache-airflow-providers-common-sql  1.5.2  
   
   
   
   
   ### Deployment
   
   Amazon (AWS) MWAA
   
   ### Deployment details
   
   _No response_
   
   ### Anything else?
   
   _No response_
   
   ### Are you willing to submit PR?
   
   - [X] Yes I am willing to submit a PR!
   
   ### Code of Conduct
   
   - [X] I agree to follow this project's [Code of 
Conduct](https://github.com/apache/airflow/blob/main/CODE_OF_CONDUCT.md)
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org
