parthchandra commented on code in PR #3718:
URL: https://github.com/apache/datafusion-comet/pull/3718#discussion_r2956462818


##########
native/core/src/errors.rs:
##########
@@ -474,6 +476,54 @@ fn throw_spark_error_as_json(
     )
 }
 
+/// Try to convert a DataFusion "Unable to get field named" error into a SparkError.
+/// DataFusion produces this error when reading Parquet files with duplicate field names
+/// in case-insensitive mode. For example, if a Parquet file has columns "B" and "b",
+/// DataFusion may deduplicate them and report: Unable to get field named "b". Valid
+/// fields: ["A", "B"]. When the requested field has a case-insensitive match among the
+/// valid fields, we convert this to Spark's _LEGACY_ERROR_TEMP_2093 error.
+fn try_convert_duplicate_field_error(error_msg: &str) -> Option<SparkError> {

Review Comment:
   Late comment:
   You're right, this is overkill. When the need arises (as in this case), 
could we simply not convert the `_LEGACY_` errors? Or, more broadly, skip the 
conversion for any error that originates in `QueryExecutionErrors`?
   The error framework matters for the errors in 
`org.apache.spark.sql.errors.ExecutionErrors` because those are SQL errors and 
correspond to predefined ANSI error codes. For `QueryExecutionErrors` we do 
not have to be as strict.
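
For reference, the check described in the doc comment above could be sketched roughly as follows. This is only an illustration, not the code in this PR: the helper name `find_case_insensitive_match` is hypothetical, and the message shape is assumed to be exactly the one quoted in the doc comment. Field names that themselves contain commas or quotes are not handled.

```rust
/// Hypothetical helper (not part of the PR): given a DataFusion error message,
/// return `(requested, matching_valid_field)` when the requested field matches
/// one of the listed valid fields ignoring ASCII case.
///
/// Assumed message shape (taken from the doc comment in the diff):
///   Unable to get field named "b". Valid fields: ["A", "B"]
fn find_case_insensitive_match(error_msg: &str) -> Option<(String, String)> {
    // Everything after the opening quote of the requested field name.
    let after_name = error_msg.split_once("Unable to get field named \"")?.1;
    // The requested name ends at the next double quote.
    let (requested, rest) = after_name.split_once('"')?;
    // Isolate the bracketed list of valid fields.
    let list = rest.split_once("Valid fields: [")?.1;
    let list = list.split_once(']')?.0;
    // Strip whitespace and quotes from each entry, then compare ignoring case.
    list.split(',')
        .map(|field| field.trim().trim_matches('"'))
        .find(|field| field.eq_ignore_ascii_case(requested))
        .map(|field| (requested.to_string(), field.to_string()))
}
```

With the example message from the doc comment, this returns `Some(("b", "B"))`. A message whose requested field has no case-insensitive match, or one that does not follow the assumed shape, yields `None`, in which case the original error would pass through unconverted.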



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

