[ 
https://issues.apache.org/jira/browse/SPARK-57189?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated SPARK-57189:
-----------------------------------
    Labels: pull-request-available  (was: )

> handleSqlCommand executes SQL twice and lets blocked Commands bypass the SDP 
> guard for WITH_RELATIONS
> -----------------------------------------------------------------------------------------------------
>
>                 Key: SPARK-57189
>                 URL: https://issues.apache.org/jira/browse/SPARK-57189
>             Project: Spark
>          Issue Type: Bug
>          Components: Connect
>    Affects Versions: 5.0.0
>            Reporter: Norio Akagi
>            Priority: Major
>              Labels: pull-request-available
>
> For requests originating from Spark Declarative Pipelines (SDP),
> SparkConnectPlanner.handleSqlCommand calls
> PipelinesHandler.blockUnsupportedSqlCommand with a queryPlan built via
> transformRelation(relation). When the relation is a WITH_RELATIONS
> that satisfies isValidSQLWithRefs, building that plan follows this
> call chain:
> {noformat}
> transformRelation
>   -> transformWithRelations
>     -> transformSqlWithRefs
>       -> executeSQLWithRefs
>         -> executeSQL
>           -> session.sql(...){noformat}
> A comment in executeSQLWithRefs states "Eagerly execute commands of the
> provided SQL string", and session.sql triggers actual execution of any
> Command/DDL/DML in the root SQL. Commands embedded in reference SubqueryAlias
> inputs also execute when eagerlyExecuteCommands walks the resolved plan tree.
> This causes two issues:
> 1. Bypassed guard. blockUnsupportedSqlCommand checks whether queryPlan
> is a Command subclass (CreateTableAsSelect, InsertIntoStatement,
> etc.). After execution, the resulting plan is wrapped as CommandResult, which 
> is not in the blocklist. The guard silently lets through exactly the things 
> it is supposed to block, and the Commands have already mutated state by the 
> time the guard runs.
> 2. Double execution. After the guard, handleSqlCommand falls through to the
> normal execution path, which calls executeSQLWithRefs again. Any DDL/DML in
> the request therefore runs twice, causing duplicate side effects.
> The guard should match the runtime's execution surface: inspect both the root 
> SQL and each reference's input, without itself triggering any execution.
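
The two failure modes described above can be reproduced in a small toy model. This is only an illustrative sketch in plain Python and does not use any Spark API: ToySession, execute_with_refs, guard, handle_buggy and handle_fixed are hypothetical names, and the "parser" is a crude keyword check. It assumes, as the description states, that eager execution returns a CommandResult-like plan and that the guard only rejects Command plans.

```python
from dataclasses import dataclass


@dataclass
class Plan:
    kind: str  # "Query", "Command", or "CommandResult" (toy stand-ins)
    sql: str


class Blocked(Exception):
    """Raised by the toy guard when it sees a Command plan."""


class ToySession:
    """Hypothetical stand-in for a session; records every executed command."""

    def __init__(self):
        self.executed = []

    def parse(self, text):
        # Side-effect free: classify the statement without running it.
        is_command = text.strip().upper().startswith(("CREATE", "INSERT", "DROP"))
        return Plan("Command" if is_command else "Query", text)

    def sql(self, text):
        # Eager: a command runs immediately and comes back wrapped as a result.
        plan = self.parse(text)
        if plan.kind == "Command":
            self.executed.append(text)
            return Plan("CommandResult", text)
        return plan


def execute_with_refs(session, root_sql, ref_sqls):
    # Toy analogue of eager execution over the references and the root SQL.
    for ref in ref_sqls:
        session.sql(ref)
    return session.sql(root_sql)


def guard(plan):
    # Only Command plans are rejected; CommandResult is not in the blocklist.
    if plan.kind == "Command":
        raise Blocked(plan.sql)


def handle_buggy(session, root_sql, ref_sqls):
    # The guard inspects an already-executed plan, then execution repeats.
    guard(execute_with_refs(session, root_sql, ref_sqls))
    return execute_with_refs(session, root_sql, ref_sqls)


def handle_fixed(session, root_sql, ref_sqls):
    # Inspect the root SQL and every reference without executing anything.
    for text in [root_sql, *ref_sqls]:
        guard(session.parse(text))
    return execute_with_refs(session, root_sql, ref_sqls)
```

In this model handle_buggy never raises and records every command twice, while handle_fixed rejects the request before anything is recorded. How the real guard should obtain an unexecuted plan for the root SQL and for each reference input is left to the actual fix.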



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]
