GitHub user squito commented on a diff in the pull request:

    https://github.com/apache/spark/pull/21299#discussion_r188650358
  
    --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/SQLExecution.scala ---
    @@ -90,13 +92,42 @@ object SQLExecution {
        * thread from the original one, this method can be used to connect the Spark jobs in this action
        * with the known executionId, e.g., `BroadcastExchangeExec.relationFuture`.
        */
    -  def withExecutionId[T](sc: SparkContext, executionId: String)(body: => T): T = {
    +  def withExecutionId[T](sparkSession: SparkSession, executionId: String)(body: => T): T = {
    +    val sc = sparkSession.sparkContext
         val oldExecutionId = sc.getLocalProperty(SQLExecution.EXECUTION_ID_KEY)
    +    withSQLConfPropagated(sparkSession) {
    +      try {
    +        sc.setLocalProperty(SQLExecution.EXECUTION_ID_KEY, executionId)
    +        body
    +      } finally {
    +        sc.setLocalProperty(SQLExecution.EXECUTION_ID_KEY, oldExecutionId)
    +      }
    +    }
    +  }
    +
    +  def withSQLConfPropagated[T](sparkSession: SparkSession)(body: => T): T = {
    +    // Set all the specified SQL configs to local properties, so that they can be available at
    +    // the executor side.
    --- End diff --
    
    Properties are serialized per task. How unusual would it be for there to be a large list of properties? If a large list is plausible, it might make more sense to use a Broadcast.
    
    (Separately, task serialization should probably avoid re-serializing the properties every time, but this change could make that existing issue much worse.)

