the-other-tim-brown commented on code in PR #17833:
URL: https://github.com/apache/hudi/pull/17833#discussion_r2850265163
##########
hudi-client/hudi-spark-client/src/main/scala/org/apache/spark/sql/hudi/SparkAdapter.scala:
##########
@@ -374,4 +377,69 @@ trait SparkAdapter extends Serializable {
* @return A streaming [[DataFrame]]
*/
def createStreamingDataFrame(sqlContext: SQLContext, relation: HadoopFsRelation, requiredSchema: StructType): DataFrame
+
+ /**
+ * Gets the VariantType DataType if supported by this Spark version.
+ * Spark 3.x returns None (VariantType not supported).
+ * Spark 4.x returns Some(VariantType).
+ *
+ * @return Option[DataType] - Some(VariantType) for Spark 4.x, None for Spark 3.x
+ */
+ def getVariantDataType: Option[DataType]
+
+ /**
+ * Checks if two data types are equal for Parquet file format purposes.
+ * This handles version-specific types like VariantType (Spark 4.0+).
+ *
+ * Returns Some(true) if types are equal, Some(false) if not equal, or None if
+ * this adapter doesn't handle this specific type comparison (fallback to default logic).
+ *
+ * @param requiredType The required/expected data type
+ * @param fileType The data type from the file
+ * @return Option[Boolean] - Some(result) if handled by adapter, None otherwise
+ */
+ def isDataTypeEqualForParquet(requiredType: DataType, fileType: DataType): Option[Boolean]
Review Comment:
Is this limited to Parquet, or can it apply to other formats like ORC?
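
For context, here is a minimal sketch of how a caller might consume the tri-state `Option[Boolean]` result described in the doc comment. It is an illustration under stated assumptions, not the PR's implementation: `AdapterSketch`, `Spark3LikeAdapter`, `Spark4LikeAdapter`, and `TypeCheck.typesMatch` are hypothetical names, the data types are local stand-ins so the snippet runs without Spark, and the Spark 4.x branch only guesses at how VariantType might be compared. In the sketch, nothing in the contract depends on the file format, which is what prompts the question about the name.

```scala
// Stand-in types so the sketch runs without Spark on the classpath.
// In Hudi these would be org.apache.spark.sql.types.DataType values.
sealed trait DataType
case object StringType extends DataType
case object BinaryType extends DataType
case object VariantType extends DataType // stand-in for Spark 4.x VariantType

// Hypothetical, trimmed-down adapter exposing only the method under review.
trait AdapterSketch {
  def isDataTypeEqualForParquet(requiredType: DataType, fileType: DataType): Option[Boolean]
}

// Spark 3.x-style adapter: knows no version-specific types, so it always defers.
object Spark3LikeAdapter extends AdapterSketch {
  def isDataTypeEqualForParquet(requiredType: DataType, fileType: DataType): Option[Boolean] =
    None
}

// Spark 4.x-style adapter (assumed behavior): answers only when VariantType is involved.
object Spark4LikeAdapter extends AdapterSketch {
  def isDataTypeEqualForParquet(requiredType: DataType, fileType: DataType): Option[Boolean] =
    (requiredType, fileType) match {
      case (VariantType, VariantType)          => Some(true)
      case (VariantType, _) | (_, VariantType) => Some(false)
      case _                                   => None // not handled here
    }
}

object TypeCheck {
  // Caller side: take the adapter's answer if it gave one,
  // otherwise fall back to default logic (plain equality in this sketch).
  def typesMatch(adapter: AdapterSketch, required: DataType, file: DataType): Boolean =
    adapter.isDataTypeEqualForParquet(required, file).getOrElse(required == file)
}
```

With this shape, `TypeCheck.typesMatch(Spark3LikeAdapter, StringType, StringType)` takes the fallback path, while the Spark 4.x-style adapter decides any comparison that involves the variant stand-in.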