The GitHub Actions job "Tests (AMD)" on airflow.git/parameterize-hive-stats-sql 
has failed.
Run started by GitHub user Har1sh-k (triggered by eladkal).

Head commit for run:
4bb09469870b904aa198f76c97384d740f8b8d3c / Harish Kolla <[email protected]>
Use parameterized queries in HiveStatsCollectionOperator

HiveStatsCollectionOperator builds its bookkeeping SQL (the SELECT
and DELETE against hive_stats in the MySQL metastore, and the
SELECT ... FROM <table> WHERE <partition_key> = '<value>' against
Presto) by f-string-interpolating template-rendered fields (table,
partition, dttm) directly into raw SQL strings.

Per the security model in airflow-core/docs/security/security_model.rst
and airflow-core/docs/security/sql.rst, Dag authors are trusted users
responsible for sanitizing input before passing it to operators. The
change here is defense-in-depth so that the operator does not rely on
each Dag author to sanitize.

The MySQL bookkeeping SELECT and DELETE now use %s placeholders with
the parameters= kwarg of MySqlHook.get_records / .run, so the values
are bound by the driver instead of interpolated.
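
As a rough illustration of the placeholder approach described above, the
following minimal sketch builds the bookkeeping statements with %s
placeholders and a separate parameter tuple. The helper name
build_bookkeeping_queries and the column names table_name,
partition_repr and dttm are assumptions made for this example and are not
taken from the commit.

```python
import json


def build_bookkeeping_queries(table, partition, dttm):
    """Return (select_sql, delete_sql, params) for the hive_stats table.

    Hypothetical helper: the column names below are assumed for
    illustration. No value is interpolated into the SQL text; each one
    is represented by a %s placeholder and carried in `params`.
    """
    # Serialize the partition dict deterministically so the same
    # partition always yields the same string.
    part_json = json.dumps(partition, sort_keys=True)
    where = "table_name = %s AND partition_repr = %s AND dttm = %s"
    select_sql = f"SELECT 1 FROM hive_stats WHERE {where} LIMIT 1"
    delete_sql = f"DELETE FROM hive_stats WHERE {where}"
    return select_sql, delete_sql, (table, part_json, dttm)


select_sql, delete_sql, params = build_bookkeeping_queries(
    "my_db.my_table", {"ds": "2024-01-01"}, "2024-01-01T00:00:00"
)
# With a configured MySqlHook instance (not created here), the calls
# would look roughly like:
#   mysql.get_records(select_sql, parameters=params)
#   mysql.run(delete_sql, parameters=params)
```

Because the SQL text is constant apart from the placeholders, a value
containing quotes or statement separators is only ever seen by the driver
as data.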

For the Presto SELECT, partition values are passed as bound parameters
using the hook's declared placeholder (PrestoHook overrides the
DbApiHook default to "?"). Identifiers (table name and partition
column names) cannot be parameterized in standard SQL, so they are
validated against ^[A-Za-z_][A-Za-z0-9_]*(?:\.[A-Za-z_][A-Za-z0-9_]*)?$
for the table and ^[A-Za-z_][A-Za-z0-9_]*$ for partition keys, raising
AirflowException on mismatch.
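
The identifier validation and value binding described above could be
sketched roughly as follows. The two regular expressions are the ones
quoted in the commit message; the function name build_presto_select, its
arguments, and the use of ValueError (in place of AirflowException, to
keep the example free of Airflow imports) are assumptions made for this
example. It also assumes a non-empty partition dict.

```python
import re

# Patterns quoted in the commit message.
TABLE_RE = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*(?:\.[A-Za-z_][A-Za-z0-9_]*)?$")
PARTITION_KEY_RE = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")


def build_presto_select(exprs_sql, table, partition, placeholder="?"):
    """Return (sql, params) for the stats SELECT.

    Hypothetical helper: identifiers are validated and then interpolated,
    while partition values are kept out of the SQL text and returned as
    bound parameters.
    """
    # fullmatch is used so a trailing newline cannot slip past "$".
    if not TABLE_RE.fullmatch(table):
        raise ValueError(f"Invalid table name: {table!r}")

    clauses, params = [], []
    for key, value in partition.items():
        if not PARTITION_KEY_RE.fullmatch(key):
            raise ValueError(f"Invalid partition key: {key!r}")
        clauses.append(f"{key} = {placeholder}")
        params.append(value)

    where = " AND ".join(clauses)
    return f"SELECT {exprs_sql} FROM {table} WHERE {where}", params


sql, params = build_presto_select("COUNT(*)", "db.tbl", {"ds": "2024-01-01"})
# sql    == "SELECT COUNT(*) FROM db.tbl WHERE ds = ?"
# params == ["2024-01-01"]
```

A table name such as "db.tbl; DROP TABLE x" fails the first check and is
rejected before any SQL string is assembled.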

Report URL: https://github.com/apache/airflow/actions/runs/25712478965

With regards,
GitHub Actions via GitBox


