Shekharrajak commented on code in PR #2991:
URL: https://github.com/apache/datafusion-comet/pull/2991#discussion_r2652181481
##########
spark/src/main/scala/org/apache/comet/serde/operator/CometDataWritingCommand.scala:
##########
@@ -50,6 +50,11 @@ object CometDataWritingCommand extends CometOperatorSerde[DataWritingCommandExec
override def getSupportLevel(op: DataWritingCommandExec): SupportLevel = {
op.cmd match {
case cmd: InsertIntoHadoopFsRelationCommand =>
+ // Skip INSERT OVERWRITE DIRECTORY operations (catalogTable is None for directory writes)
+ if (cmd.catalogTable.isEmpty) {
Review Comment:
<img width="1716" height="937" alt="Image"
src="https://github.com/user-attachments/assets/3a4ec7ca-bb45-4cd8-a2b6-3b2d5e3b1382"
/>
This check is a fix for the following error:
```
ERROR org.apache.spark.sql.execution.command.InsertIntoDataSourceDirCommand: Failed to write to directory Some(file:/__w/datafusion-comet/datafusion-comet/apache-spark/target/tmp/spark-76b62d31-5bd6-4d4b-9770-262cb08e84f3)
org.apache.spark.sql.AnalysisException: [COLUMN_ALREADY_EXISTS] The column `id` already exists. Choose another name or rename the existing column. SQLSTATE: 42711
  at org.apache.spark.sql.errors.QueryCompilationErrors$.columnAlreadyExistsError(QueryCompilationErrors.scala:2700)
  at org.apache.spark.sql.util.SchemaUtils$.checkColumnNameDuplication(SchemaUtils.scala:151)
  at org.apache.spark.sql.execution.datasources.InsertIntoHadoopFsRelationCommand.run(InsertIntoHadoopFsRelationCommand.scala:86)
  at org.apache.spark.sql.execution.command.DataWritingCommandExec.sideEffectResult$lzycompute(commands.scala:117)
  at org.apache.spark.sql.execution.command.DataWritingCommandExec.sideEffectResult(commands.scala:115)
  at org.apache.spark.sql.execution.command.DataWritingCommandExec.executeCollect(commands.scala:129)
  at org.apache.spark.sql.execution.QueryExecution.$anonfun$eagerlyExecuteCommands$2(QueryExecution.scala:155)
[info] - SPARK-25389 INSERT OVERWRITE LOCAL DIRECTORY ... STORED AS with duplicated names(caseSensitivity=true, format=orc) (22 milliseconds)
18:44:25.173 ERROR org.apache.spark.sql.execution.command.InsertIntoDataSourceDirCommand: Failed to write to directory Some(file:/__w/datafusion-comet/datafusion-comet/apache-spark/target/tmp/spark-76ef391d-5d5f-4997-afb4-97ac714c1697)
```
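For readers without the full diff, here is a minimal, self-contained sketch of the guard's intent. The hunk above is cut off at the `if`, so the body of the branch is an assumption: the sketch supposes it reports the command as unsupported so that Spark's own writer handles directory writes. `SupportLevel`, `Compatible`, `Unsupported`, `InsertCommand`, and `DirectoryWriteGuard` are hypothetical stand-ins defined here, not the real Comet or Spark types.

```scala
// Hypothetical stand-ins for illustration only; not the real Comet or Spark types.
sealed trait SupportLevel
case object Compatible extends SupportLevel
final case class Unsupported(reason: String) extends SupportLevel

// Models only the single field that the guard inspects.
final case class InsertCommand(catalogTable: Option[String])

object DirectoryWriteGuard {
  def getSupportLevel(cmd: InsertCommand): SupportLevel =
    if (cmd.catalogTable.isEmpty) {
      // Assumption: a directory write carries no catalog table,
      // so the command is declined and left to Spark.
      Unsupported("INSERT OVERWRITE DIRECTORY is not supported")
    } else {
      Compatible
    }
}
```

Under these assumptions, a command with `catalogTable = None` is declined, while one with a catalog table is accepted.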
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]