iemejia commented on a change in pull request #14858:
URL: https://github.com/apache/beam/pull/14858#discussion_r644308777
##########
File path:
sdks/java/core/src/main/java/org/apache/beam/sdk/schemas/utils/AvroUtils.java
##########
@@ -906,6 +906,23 @@ private void readObject(ObjectInputStream in) throws
IOException, ClassNotFoundE
.map(x -> getFieldSchema(x.getType(), x.getName(),
namespace))
.collect(Collectors.toList()));
break;
+
+ case "NVARCHAR":
+ case "VARCHAR":
+ case "LONGNVARCHAR":
+ case "LONGVARCHAR":
+ baseType = org.apache.avro.Schema.create(Type.STRING);
Review comment:
These would be better represented as 'proper' logical types that carry the associated size.
We should probably align the Avro schema representation with the way other
systems (Hive / Spark) represent these types in Avro. For reference:
http://apache-avro.679487.n3.nabble.com/Standardizing-char-and-varchar-logical-types-td4038622.html
or from the source
https://github.com/apache/hive/blob/5d268834a5f5278ea76399f8af0d0ab043ae0b45/serde/src/test/resources/avro-struct.avsc#L11
Keeping the full information in the internal representation is fine, but we
should not lose the maxLength (size) from the type information.
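A minimal sketch of what that could look like (the `varcharSchema` helper is hypothetical, not part of this PR; since `varchar` is not a built-in Avro logical type, it is registered here as a generic `LogicalType`, mirroring Hive's `{"type":"string","logicalType":"varchar","maxLength":...}` representation):

```java
import org.apache.avro.LogicalType;
import org.apache.avro.Schema;

public class VarcharLogicalTypeSketch {
  // Hypothetical helper: build an Avro string schema that keeps the JDBC
  // precision as a "varchar" logical type plus a "maxLength" property,
  // instead of collapsing to a plain STRING and losing the size.
  static Schema varcharSchema(int maxLength) {
    Schema schema = Schema.create(Schema.Type.STRING);
    // "varchar" is not one of Avro's built-in logical types, so we register
    // it as a generic LogicalType; this sets the "logicalType" schema prop.
    new LogicalType("varchar").addToSchema(schema);
    // Carry the size along in the schema JSON, as Hive's avro-struct.avsc does.
    schema.addProp("maxLength", maxLength);
    return schema;
  }

  public static void main(String[] args) {
    // Serializes roughly as:
    // {"type":"string","logicalType":"varchar","maxLength":255}
    System.out.println(varcharSchema(255));
  }
}
```

A consumer can then recover the size via `schema.getObjectProp("maxLength")` while plain-STRING readers still see a compatible string schema.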
Apart from this, the rest of the PR looks pretty good. Thanks for working on
this!
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
[email protected]