aicam commented on code in PR #4995:
URL: https://github.com/apache/texera/pull/4995#discussion_r3220265360
##########
common/workflow-core/src/main/scala/org/apache/texera/amber/util/ArrowUtils.scala:
##########
@@ -94,19 +94,23 @@ object ArrowUtils extends LazyLogging {
/**
* Converts an Arrow Schema into Texera Schema.
- * Checks field metadata to detect LARGE_BINARY types.
+ * Checks field metadata to recover types that share an Arrow representation
+ * (LARGE_BINARY and ANY both ride on Utf8).
*
* @param arrowSchema The Arrow Schema to be converted.
* @return A Texera Schema.
*/
def toTexeraSchema(arrowSchema: org.apache.arrow.vector.types.pojo.Schema): Schema =
Schema(
arrowSchema.getFields.asScala.map { field =>
- val isLargeBinary = Option(field.getMetadata)
- .exists(m => m.containsKey("texera_type") && m.get("texera_type") == "LARGE_BINARY")
+ val taggedType = Option(field.getMetadata)
+ .flatMap(m => Option(m.get("texera_type")))
Review Comment:
I think we can define a constant for the metadata key instead of repeating the string "texera_type".
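A rough sketch of the suggestion, not a drop-in patch. `ArrowMetadataKeys`, `TexeraTypeKey`, and `TaggedTypeExample` are hypothetical names chosen for illustration, and the lookup takes a plain `java.util.Map` so the snippet stands alone without the Arrow dependency (this assumes `field.getMetadata` returns a `java.util.Map[String, String]`):

```scala
import java.util.{Map => JMap}

object ArrowMetadataKeys {
  // Hypothetical constant name and location; the PR may pick something else.
  val TexeraTypeKey: String = "texera_type"
}

object TaggedTypeExample {
  import ArrowMetadataKeys.TexeraTypeKey

  // Mirrors the lookup in the diff: a null metadata map or a missing key
  // both produce None, otherwise the tagged type name is returned.
  def taggedType(metadata: JMap[String, String]): Option[String] =
    Option(metadata).flatMap(m => Option(m.get(TexeraTypeKey)))
}
```

The code that writes the tag could reference the same constant, so the key is spelled out in only one place.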