andygrove commented on code in PR #2828:
URL: https://github.com/apache/datafusion-comet/pull/2828#discussion_r2572124993
##########
native/core/src/execution/operators/parquet_writer.rs:
##########
@@ -197,11 +224,18 @@ impl ExecutionPlan for ParquetWriterExec {
Arc::clone(&input_schema)
};
+ // Determine the write path (work_dir for temp files, or output_path for direct write)
Review Comment:
This is now removed.
##########
spark/src/main/scala/org/apache/spark/sql/comet/CometNativeWriteExec.scala:
##########
@@ -86,27 +154,149 @@ case class CometNativeWriteExec(nativeOp: Operator, child: SparkPlan, outputPath
// Capture metadata before the transformation
val numPartitions = childRDD.getNumPartitions
val numOutputCols = child.output.length
+ val broadcastedCommitter = committer.map(c => sparkContext.broadcast(c))
Review Comment:
Thanks for catching this. I removed the broadcast and captured the variable
as part of the closure instead.
##########
native/proto/src/proto/operator.proto:
##########
@@ -241,6 +241,26 @@ message ParquetWriter {
string output_path = 1;
CompressionCodec compression = 2;
repeated string column_names = 4;
+ // Working directory for temporary files (used by FileCommitProtocol)
+ // If not set, files are written directly to output_path
+ optional string work_dir = 5;
+ // Job ID for tracking this write operation
+ optional string job_id = 6;
+ // Task attempt ID for this specific task
+ optional int32 task_attempt_id = 7;
+}
+
+// Information about a file written by ParquetWriter
+// This is returned to the JVM for commit protocol coordination
+message WrittenFileInfo {
Review Comment:
Thanks. I have removed this now.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]