iemejia commented on a change in pull request #14858:
URL: https://github.com/apache/beam/pull/14858#discussion_r644308777
##########
File path:
sdks/java/core/src/main/java/org/apache/beam/sdk/schemas/utils/AvroUtils.java
##########
@@ -906,6 +906,23 @@ private void readObject(ObjectInputStream in) throws
IOException, ClassNotFoundE
.map(x -> getFieldSchema(x.getType(), x.getName(),
namespace))
.collect(Collectors.toList()));
break;
+
+ case "NVARCHAR":
+ case "VARCHAR":
+ case "LONGNVARCHAR":
+ case "LONGVARCHAR":
+ baseType = org.apache.avro.Schema.create(Type.STRING);
Review comment:
These would be better represented as 'proper' logical types that carry the associated size.
We should probably align the Avro schema representation with the way other
systems (Hive / Spark) represent these types in Avro. For reference:
http://apache-avro.679487.n3.nabble.com/Standardizing-char-and-varchar-logical-types-td4038622.html
or from the source
https://github.com/apache/hive/blob/5d268834a5f5278ea76399f8af0d0ab043ae0b45/serde/src/test/resources/avro-struct.avsc#L11
Keeping the full information in the internal representation is fine, but we
should not lose the maxLength (size) from the type information.
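A minimal sketch of what that could look like (the `varcharSchema` helper is hypothetical, not part of this PR; since `varchar` is not a built-in Avro logical type, it is registered here as a generic `LogicalType`, mirroring Hive's `{"type":"string","logicalType":"varchar","maxLength":...}` representation):

```java
import org.apache.avro.LogicalType;
import org.apache.avro.Schema;

public class VarcharLogicalTypeSketch {
  // Hypothetical helper: build an Avro string schema that keeps the JDBC
  // precision as a "varchar" logical type plus a "maxLength" property,
  // instead of collapsing to a plain STRING and losing the size.
  static Schema varcharSchema(int maxLength) {
    Schema schema = Schema.create(Schema.Type.STRING);
    // "varchar" is not one of Avro's built-in logical types, so we register
    // it as a generic LogicalType; this sets the "logicalType" schema prop.
    new LogicalType("varchar").addToSchema(schema);
    // Carry the size along in the schema JSON, as Hive's avro-struct.avsc does.
    schema.addProp("maxLength", maxLength);
    return schema;
  }

  public static void main(String[] args) {
    // Serializes roughly as:
    // {"type":"string","logicalType":"varchar","maxLength":255}
    System.out.println(varcharSchema(255));
  }
}
```

A consumer can then recover the size via `schema.getObjectProp("maxLength")` while plain-STRING readers still see a compatible string schema.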
Apart from this, the rest of the PR looks pretty good. Thanks for working on
this!
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
[email protected]