MohammadHilal1 opened a new issue, #52238:
URL: https://github.com/apache/airflow/issues/52238

   ### Apache Airflow version
   
   3.0.2
   
   ### If "Other Airflow 2 version" selected, which one?
   
   _No response_
   
   ### What happened?
   
   I am creating a DAG to extract, transform, and load data into a MySQL database, but the load task fails with the exception below:
   
    UserWarning: pandas only supports SQLAlchemy connectable (engine/connection) or database string URI or sqlite3 DBAPI2 connection. Other DBAPI2 objects are not tested. Please consider using SQLAlchemy.
    [2025-06-25, 16:02:11 IST] {logging_mixin.py:188} INFO - Exception while inserting to db: 'Engine' object has no attribute 'cursor'
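
   As far as I can tell from the warning, pandas fell back to treating the Engine as a raw DBAPI2 connection instead of taking the SQLAlchemy path, and that fallback then calls cursor() on it, which an Engine does not provide. A rough sketch of that failure mode, using the same URI as above:

       # Rough sketch of the failure mode described above: a SQLAlchemy Engine
       # has no DBAPI2 cursor() method, so any code path that treats it as a
       # raw DBAPI2 connection fails exactly like the log message shows.
       from sqlalchemy import create_engine

       engine = create_engine("mysql+pymysql://root:1111@127.0.0.1:3307/airflow_db")
       print(hasattr(engine, "cursor"))  # False
       engine.cursor()  # AttributeError: 'Engine' object has no attribute 'cursor'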
   
   The code that raises the exception is:
   
   def load_market_data(flattened_df_json):
       records = json.loads(flattened_df_json)
       df = pd.DataFrame(records)
       df['from'] = pd.to_datetime(df['from']).dt.date
       df['volume'] = df['volume'].astype('Int64')
       engine = create_engine("mysql+pymysql://root:1111@127.0.0.1:3307/airflow_db")
       try:
           df.to_sql(name="market_data", con=engine, if_exists="append", index=False)
           print("Data loaded successfully")
       except Exception as e:
           print("Exception while inserting to db:", str(e))
           raise
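
   For comparison, here is a minimal variant of the same load step that hands to_sql an explicit Connection opened with engine.begin() instead of the Engine object itself; the table name and URI are the ones used above, and the helper name is just for illustration:

       # Minimal sketch (illustrative name): same load step, but passing an
       # explicit Connection from engine.begin(), which opens a transaction
       # and commits on exit, rather than the Engine object itself.
       import json

       import pandas as pd
       from sqlalchemy import create_engine

       def load_market_data_with_connection(flattened_df_json):
           records = json.loads(flattened_df_json)
           df = pd.DataFrame(records)
           df['from'] = pd.to_datetime(df['from']).dt.date
           df['volume'] = df['volume'].astype('Int64')
           engine = create_engine("mysql+pymysql://root:1111@127.0.0.1:3307/airflow_db")
           with engine.begin() as connection:
               df.to_sql(name="market_data", con=connection, if_exists="append", index=False)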
   
   Versions in use:
   
   Ubuntu 24.04 LTS (64-bit)
   Python 3.12.3
   pandas 2.3.0
   MySQL 8.0.42
   SQLAlchemy 1.4.54
   Airflow 2.9.1
   
   ### What you think should happen instead?
   
   _No response_
   
   ### How to reproduce
   
   from airflow import DAG
   import pandas as pd
   from airflow.decorators import task
   from datetime import datetime, timedelta
   import requests
   import json
   from sqlalchemy import create_engine
   
   with DAG(
       dag_id='market_etl',
       start_date=datetime(2025, 6, 1),
       schedule='* * * * *',
       catchup=False,
       max_active_runs=1,
       tags=['example'],
       default_args={
           "retries": 3,
           "retry_delay": timedelta(minutes=5)
       }
   ) as dag:
       @task()
       def hit_polygen_api(**context):
           stock_ticker = "AMZN"
           polygon_api_key = "6q8VWReS9OColVvppbveW432iY4092RY"
           date_string = context.get("ds")
           date_obj = datetime.strptime(date_string, "%Y-%m-%d") - timedelta(days=1)
           date_string = date_obj.strftime("%Y-%m-%d")
           url = f"https://api.polygon.io/v1/open-close/{stock_ticker}/{date_string}?adjusted=true&apiKey={polygon_api_key}"
           response = requests.get(url)
           return response.json()
       
       @task()
       def flatten_market_data(polygon_response, **context):
           date_string = context.get("ds")
           date_obj = datetime.strptime(date_string, "%Y-%m-%d") - timedelta(days=1)
           date_string = date_obj.strftime("%Y-%m-%d")
           columns = {
               "status": "Closed",
               "from": date_string,
               "symbol": "AMZN",
               "open": "None",
               "high": "None",
               "low": "None",
               "close": "None",
               "volume": "None"
           }
           flattened_record = {
               header_name: polygon_response.get(header_name, default_value)
               for header_name, default_value in columns.items()
           }
           flattened_df = pd.DataFrame([flattened_record])
           return flattened_df.to_json(orient="records")
       
       @task
       def load_market_data(flattened_df_json):
           records = json.loads(flattened_df_json)
           df = pd.DataFrame(records)
           df['from'] = pd.to_datetime(df['from']).dt.date
           df['volume'] = df['volume'].astype('Int64')
           engine = create_engine("mysql+pymysql://root:1111@127.0.0.1:3307/airflow_db")
           try:
               df.to_sql(name="market_data", con=engine, if_exists="append", index=False)
               print("Data loaded successfully")
           except Exception as e:
               print("Exception while inserting to db:", str(e))
               raise
   
       raw_market_data = hit_polygen_api()
       transformed_market_data = flatten_market_data(raw_market_data)
       load_market_data(transformed_market_data)
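
   If it helps with triage, here is a small diagnostic task (the name check_versions is made up for this sketch) that logs which pandas and SQLAlchemy versions the worker actually imports at run time:

       # Hypothetical diagnostic task, not part of the DAG above: logs the
       # pandas and SQLAlchemy versions imported in the worker environment.
       import pandas as pd
       import sqlalchemy
       from airflow.decorators import task

       @task
       def check_versions():
           print("pandas:", pd.__version__)
           print("SQLAlchemy:", sqlalchemy.__version__)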
   
   ### Operating System
   
   Ubuntu 24.04 LTS (64-bit)
   
   ### Versions of Apache Airflow Providers
   
   apache-airflow-providers-common-io       1.3.1
   apache-airflow-providers-common-sql      1.12.0
   apache-airflow-providers-fab             1.0.4
   apache-airflow-providers-ftp             3.8.0
   apache-airflow-providers-http            4.10.1
   apache-airflow-providers-imap            3.5.0
   apache-airflow-providers-mysql           5.5.4
   apache-airflow-providers-smtp            1.6.1
   apache-airflow-providers-sqlite          3.7.1
   
   ### Deployment
   
   Virtualenv installation
   
   ### Deployment details
   
   _No response_
   
   ### Anything else?
   
   _No response_
   
   ### Are you willing to submit PR?
   
   - [ ] Yes I am willing to submit a PR!
   
   ### Code of Conduct
   
   - [x] I agree to follow this project's [Code of Conduct](https://github.com/apache/airflow/blob/main/CODE_OF_CONDUCT.md)
   

