[ https://issues.apache.org/jira/browse/SPARK-57189?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
ASF GitHub Bot updated SPARK-57189:
-----------------------------------
Labels: pull-request-available (was: )
> handleSqlCommand executes SQL twice and lets blocked Commands bypass the SDP
> guard for WITH_RELATIONS
> -----------------------------------------------------------------------------------------------------
>
> Key: SPARK-57189
> URL: https://issues.apache.org/jira/browse/SPARK-57189
> Project: Spark
> Issue Type: Bug
> Components: Connect
> Affects Versions: 5.0.0
> Reporter: Norio Akagi
> Priority: Major
> Labels: pull-request-available
>
> For requests originating from Spark Declarative Pipelines (SDP),
> SparkConnectPlanner.handleSqlCommand calls
> PipelinesHandler.blockUnsupportedSqlCommand with a queryPlan built via
> transformRelation(relation). When the relation is a WITH_RELATIONS
> matching isValidSQLWithRefs, this transformation chain leads to:
> {noformat}
> transformRelation
> -> transformWithRelations
> -> transformSqlWithRefs
> -> executeSQLWithRefs
> -> executeSQL
> -> session.sql(...)
> {noformat}
> executeSQLWithRefs is explicitly commented "Eagerly execute commands of the
> provided SQL string", and session.sql triggers actual execution of any
> Command/DDL/DML in the root SQL. Commands embedded in reference SubqueryAlias
> inputs also execute when eagerlyExecuteCommands walks the resolved plan tree.
> This causes two issues:
> 1. Bypassed guard. blockUnsupportedSqlCommand checks whether queryPlan
> is a Command subclass (CreateTableAsSelect, InsertIntoStatement,
> etc.). After execution, the resulting plan is wrapped as CommandResult, which
> is not in the blocklist. The guard silently lets through exactly the things
> it is supposed to block, and the Commands have already mutated state by the
> time the guard runs.
> 2. Double execution. After the guard, handleSqlCommand falls through to the
> normal execution path, which calls executeSQLWithRefs again. Any DDL/DML in
> the request runs twice, causing duplicate side effects.
> The guard should match the runtime's execution surface: inspect both the root
> SQL and each reference's input, without itself triggering any execution.
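
The behavior described above can be sketched with a toy model. This is only
an illustration under stated assumptions: Plan, Query, Command, CommandResult,
SqlWithRefs and GuardSketch below are hypothetical stand-ins written for this
note, not Spark's classes, and the two guard functions only mimic the shape of
the reported problem and of the proposed fix. No real Spark API is called.

```scala
// Toy stand-ins for illustration only; none of these are Spark's real classes.
sealed trait Plan
final case class Query(sql: String) extends Plan          // side-effect-free, e.g. SELECT
final case class Command(sql: String) extends Plan        // DDL/DML, e.g. CTAS or INSERT
final case class CommandResult(sql: String) extends Plan  // wrapper left behind by eager execution

// Hypothetical model of a WITH_RELATIONS request: root SQL plus named reference inputs.
final case class SqlWithRefs(root: Plan, refs: Map[String, Plan])

object GuardSketch {
  // Records every statement that "ran", so side effects are observable.
  val executed = scala.collection.mutable.ListBuffer.empty[String]

  // Models eager execution: a Command runs and comes back as a CommandResult.
  def eagerlyExecute(plan: Plan): Plan = plan match {
    case Command(sql) => executed += sql; CommandResult(sql)
    case other        => other
  }

  // Models the reported behavior: the guard only sees the plan after execution,
  // so it checks a CommandResult and never finds a Command.
  def guardAfterExecution(req: SqlWithRefs): Boolean = {
    req.refs.values.foreach(eagerlyExecute)
    eagerlyExecute(req.root).isInstanceOf[Command] // true would mean "blocked"
  }

  // Proposed shape: inspect the root and every reference input, execute nothing.
  def findBlocked(req: SqlWithRefs): Seq[String] =
    (req.root +: req.refs.values.toSeq).collect { case Command(sql) => sql }
}
```

In this model, a request whose root or reference input is a Command is
reported by findBlocked while executed stays empty. guardAfterExecution
returns false for the same request and has already recorded the statements as
run by the time it returns.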
--
This message was sent by Atlassian Jira
(v8.20.10#820010)