majiayu000 opened a new pull request, #63530:
URL: https://github.com/apache/airflow/pull/63530

   Fixes incorrect SQL keyword parameterization in 
`MySqlHook.bulk_load_custom()` that causes MySQL drivers to quote 
`IGNORE`/`REPLACE` keywords as string literals, producing invalid SQL.
   
   ## Problem
   
   `bulk_load_custom()` passes `duplicate_key_handling` (e.g., `IGNORE`, 
`REPLACE`) and `extra_options` as parameterized query values via 
`cursor.execute(sql, parameters)`. The MySQL driver treats parameterized values 
as data and quotes them as string literals, producing invalid SQL like:
   
   ```sql
   LOAD DATA LOCAL INFILE '/tmp/file' 'IGNORE' INTO TABLE `my_table` 'FIELDS 
TERMINATED BY ...'
   ```
   
   This was introduced in PR #33328 which changed from string concatenation to 
parameterization for these keywords.
   
   ## Root Cause
   
   In `providers/mysql/src/airflow/providers/mysql/hooks/mysql.py`, the 
`bulk_load_custom` method builds the SQL as:
   
   ```python
   sql_statement = f"LOAD DATA LOCAL INFILE %s %s INTO TABLE `{table}` %s"
   parameters = (tmp_file, duplicate_key_handling, extra_options)
   ```
   
   Both `duplicate_key_handling` and `extra_options` are SQL syntax keywords, 
not data values. Only `tmp_file` is actual data that should be parameterized.
   
   ## Fix
   
   Changed `bulk_load_custom` to interpolate `duplicate_key_handling` and 
`extra_options` directly into the SQL statement via f-string, while keeping 
`tmp_file` as the sole parameterized value:
   
   ```python
   sql_statement = f"LOAD DATA LOCAL INFILE %s {duplicate_key_handling} INTO 
TABLE `{table}` {extra_options}"
   parameters = (tmp_file,)
   ```
   
   Updated existing tests (`test_bulk_load_custom`, 
`test_bulk_load_custom_hook_lineage`) to assert the new SQL shape and parameter 
tuple. Added a new parametrized test 
`test_bulk_load_custom_duplicate_key_not_parameterized` that validates both 
`IGNORE` and `REPLACE` appear literally in the executed SQL and only `tmp_file` 
is parameterized.
   
   closes: #62506
   
   ---
   
   ##### Was generative AI tooling used to co-author this PR?
   
   - [X] Yes — Claude Opus 4.6
   
   Generated-by: Claude Opus 4.6 following [the 
guidelines](https://github.com/apache/airflow/blob/main/contributing-docs/05_pull_requests.rst#gen-ai-assisted-contributions)


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to