gopidesupavan commented on code in PR #61794:
URL: https://github.com/apache/airflow/pull/61794#discussion_r2814150757


##########
providers/postgres/src/airflow/providers/postgres/hooks/postgres.py:
##########
@@ -691,3 +691,11 @@ def insert_rows(
                     nb_rows += len(chunked_rows)
                     self.log.info("Loaded %s rows into %s so far", nb_rows, table)
         self.log.info("Done loading. Loaded a total of %s rows into %s", nb_rows, table)
+
+    def get_schema(self, table_name: str):
+        from airflow.providers.common.sql.hooks.handlers import fetch_all_handler
+
+        return self.run(
+            sql=f"""SELECT column_name, data_type FROM information_schema.columns WHERE table_name = '{table_name}';""",

Review Comment:
   > `table_name` comes from the user-provided DataSourceConfig, and the f-string is passed directly into the SQL. This is a SQL injection risk. For example, the query below could be executed to drop a table:
   > 
   > `SELECT column_name, data_type FROM information_schema.columns WHERE table_name = ''; DROP TABLE users; --';`
   > 
   > This happens before the LLM is called, so the ValidateSQL safety layer doesn't protect against it.
   
   @shivaam No, `get_schema` is mainly there to provide schema information to the LLM; no data is pulled at this step.
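
For reference, the injection concern quoted above is usually addressed by binding `table_name` as a query parameter instead of interpolating it into the SQL text. The following is only a minimal sketch of that idea, not the code in this PR: it uses the standard-library `sqlite3` module and a hypothetical `columns_catalog` table as a stand-in for `information_schema.columns`, so that it runs without a Postgres server. In the hook itself the equivalent would presumably pass the value through the `parameters` argument of `self.run(...)` with a `%s` placeholder, but that is an assumption about how the fix might look.

```python
import sqlite3

# Stand-in for information_schema.columns (hypothetical table name);
# sqlite3 is used only so that the sketch is runnable anywhere.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE columns_catalog (table_name TEXT, column_name TEXT, data_type TEXT);
    INSERT INTO columns_catalog VALUES
        ('users', 'id', 'integer'),
        ('users', 'email', 'text');
    """
)


def get_schema(conn, table_name):
    """Return (column_name, data_type) rows for one table."""
    # The value is bound as a parameter, never interpolated into the SQL text.
    # sqlite3 uses "?" placeholders; psycopg2 uses "%s" instead.
    query = (
        "SELECT column_name, data_type FROM columns_catalog "
        "WHERE table_name = ? ORDER BY column_name"
    )
    return conn.execute(query, (table_name,)).fetchall()


safe_rows = get_schema(conn, "users")
# The injection string is compared as a plain literal and matches nothing.
malicious_rows = get_schema(conn, "'; DROP TABLE users; --")
remaining = conn.execute("SELECT COUNT(*) FROM columns_catalog").fetchone()[0]
```

With bound parameters the malicious input is treated purely as a value to compare against `table_name`, so no second statement is ever executed and the catalog rows are left untouched.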



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
