YoannAbriel opened a new pull request, #62865:
URL: https://github.com/apache/airflow/pull/62865

   # fix: stop parameterizing SQL keywords in MySQL bulk_load_custom
   
   ## Problem
   
   `MySqlHook.bulk_load_custom()` passes `duplicate_key_handling` (e.g. 
`IGNORE`, `REPLACE`) and `extra_options` as parameterized query values via 
`cursor.execute(sql, parameters)`. The MySQL driver treats parameterized values 
as data and quotes them as string literals, producing invalid SQL like:
   
   ```sql
   LOAD DATA LOCAL INFILE '/tmp/file' 'IGNORE' INTO TABLE `my_table` 'FIELDS 
TERMINATED BY ...'
   ```
   
   This was introduced in PR #33328 which changed from string concatenation to 
parameterization for these keywords. A prior fix attempt existed (#41078) but 
was closed without merge.
   
   ## Root Cause
   
   In `providers/mysql/src/airflow/providers/mysql/hooks/mysql.py`, the 
`bulk_load_custom` method builds the SQL as:
   
   ```python
   sql_statement = f"LOAD DATA LOCAL INFILE %s %s INTO TABLE `{table}` %s"
   parameters = (tmp_file, duplicate_key_handling, extra_options)
   ```
   
   Both `duplicate_key_handling` and `extra_options` are SQL syntax keywords, 
not data values. Only `tmp_file` is actual data that should be parameterized.
   
   ## Fix
   
   Changed `bulk_load_custom` to interpolate `duplicate_key_handling` and 
`extra_options` directly into the SQL statement via f-string, while keeping 
`tmp_file` as the sole parameterized value:
   
   ```python
   sql_statement = f"LOAD DATA LOCAL INFILE %s {duplicate_key_handling} INTO 
TABLE `{table}` {extra_options}"
   parameters = (tmp_file,)
   ```
   
   Updated existing tests (`test_bulk_load_custom`, 
`test_bulk_load_custom_hook_lineage`) to assert the new SQL shape and parameter 
tuple. Added a new parametrized test 
`test_bulk_load_custom_duplicate_key_not_parameterized` that validates both 
`IGNORE` and `REPLACE` appear literally in the executed SQL and only `tmp_file` 
is parameterized.
   
   Closes: #62506
   
   ##### Was generative AI tooling used to co-author this PR?
   - [X] Yes — Claude Code
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to