ulysses-you commented on code in PR #5480:
URL: https://github.com/apache/incubator-gluten/pull/5480#discussion_r1580445934


##########
backends-velox/src/main/scala/org/apache/spark/sql/execution/VeloxColumnarWriteFilesExec.scala:
##########
@@ -308,12 +321,39 @@ class VeloxColumnarWriteFilesExec(
     }
   }
 
-  override protected def withNewChildInternal(newChild: SparkPlan): 
WriteFilesExec =
-    new VeloxColumnarWriteFilesExec(
-      newChild,
-      fileFormat,
-      partitionColumns,
-      bucketSpec,
-      options,
-      staticPartitions)
+  override def left: SparkPlan = child
+
+  // This is a workaround for FileFormatWriter#write. Vanilla Spark (version 
>= 3.4) requires for
+  // a plan that has at least one node exactly of type `WriteFilesExec` that 
is a Scala case-class,
+  // to decide to choose new `#executeWrite` code path over the legacy 
`#execute` for write
+  // operation.
+  //
+  // So we add a no-op `WriteFilesExec` child to let Spark pick the new code 
path.
+  //
+  // See: FileFormatWriter#write
+  // See: V1Writes#getWriteFilesOpt
+  override val right: SparkPlan =
+    WriteFilesExec(NoopLeaf(), fileFormat, partitionColumns, bucketSpec, 
options, staticPartitions)

Review Comment:
   @zhztheplayer sorry for the post review. It seems quick hack to me and 
really hard to read. I prefer to inherit `case class WriteFileExec`. BTW, This 
workaround has two issues:
   
   1. What does UI show the new SQL graph ?
   2. It would cause extra local sort if the ordering matches, see the 
[code](https://github.com/apache/spark/blob/79357c8ccd22729a074c42f700544e7e3f023a8d/sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/FileFormatWriter.scala#L157-L160),
 Spark would get the ordering of WriteFileExec's child, but this workaround 
miss the information.
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to