Har1sh-k commented on code in PR #66751:
URL: https://github.com/apache/airflow/pull/66751#discussion_r3358177748


##########
providers/apache/hive/src/airflow/providers/apache/hive/operators/hive_stats.py:
##########
@@ -29,6 +30,21 @@
 if TYPE_CHECKING:
     from airflow.providers.common.compat.sdk import Context
 
+# The table, the partition columns, and the metastore columns projected in the 
Presto
+# stats SELECT are interpolated as identifiers, which cannot be bound as SQL 
parameters.
+# Plain word identifiers are emitted unchanged; anything else is double-quoted 
with
+# embedded quotes doubled (how Presto/Trino escape identifiers). Kept local 
rather than
+# reusing common.sql's ``Dialect.escape_word``, which needs a live connection 
and does
+# not double embedded quotes.
+_PLAIN_IDENT_RE = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")
+
+
+def _quote_presto_identifier(identifier: str) -> str:
+    """Quote a Presto/Trino identifier unless it is already a plain word 
identifier."""
+    if _PLAIN_IDENT_RE.fullmatch(identifier):
+        return identifier
+    return '"' + identifier.replace('"', '""') + '"'

Review Comment:
   Good catch. Fixed.
   
   `_quote_presto_identifier` now passes an already-quoted identifier through 
unchanged, so `"test-123-quoted"` stays `"test-123-quoted"` instead of becoming 
`"""test-123-quoted"""`. Malformed/unbalanced inputs (e.g. `"a"b"`) are still 
escaped, and because the passthrough is a `fullmatch`, anything with trailing 
SQL after the closing quote (e.g. `"x"; DROP TABLE users--`) is still escaped 
to one inert identifier. Added a unit test covering these cases.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to