ccciudatu commented on a change in pull request #13572:
URL: https://github.com/apache/beam/pull/13572#discussion_r548430771



##########
File path: 
sdks/java/io/thrift/src/main/java/org/apache/beam/sdk/io/thrift/ThriftSchema.java
##########
@@ -90,17 +95,17 @@
  *       parameter exists.
  *   <li>All non-union types have a corresponding java field with the same 
name for every field in
  *       the original thrift source file.
- *   <li>The underlying {@link FieldMetaData#getStructMetaDataMap(Class) 
metadata maps} are {@link
- *       java.util.EnumMap enum maps}, so the natural order of the field keys 
is preserved.
  * </ul>
  *
  * <p>Thrift typedefs for container types (and possibly others) do not 
preserve the full type
  * information. For this reason, this class allows for {@link #custom() manual 
registration} of such
  * "lossy" typedefs with their corresponding beam types.
  *
- * <p>Note: upon restoring the same thrift object from a Beam {@link
- * org.apache.beam.sdk.values.Row}, the {@link TBase#isSet(TFieldIdEnum) isSet 
flag} will be {@code
- * true} for all fields, except for non-primitive types with no default values.
+ * <p>Note: Thrift encoding and decoding are not fully symmetrical, i.e. the 
{@link
+ * TBase#isSet(TFieldIdEnum) isSet} flag may not be preserved upon converting 
a thrift object to a
+ * beam row and back. On encoding, we extract all thrift values, no matter if 
the fields are set or
+ * not. On decoding, we set all non-{@code null} beam row values to the 
corresponding thrift fields,
+ * leaving the rest unset.

Review comment:
       I changed the set-all-fields policy to set only non-null values. I 
think this is safer (no NPE when a primitive thrift field is null in the beam 
row), easier to reason about, and more natural for thrift clients, who are 
used to checking whether a field is set before using it.
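
       For illustration, the decoding policy described above might look 
roughly like the sketch below. This is not the code in this PR: `FakeStruct` 
is a hypothetical stand-in for a generated thrift class, and a plain `Map` 
stands in for the beam row, so the example has no thrift or beam dependency.

```java
import java.util.HashMap;
import java.util.Map;

public class SetNonNullSketch {
  /** Hypothetical stand-in for a generated thrift struct (not a real thrift class). */
  static final class FakeStruct {
    private int count; // primitive field: cannot hold null
    private String name;
    private boolean countSet;
    private boolean nameSet;

    void setCount(int value) {
      count = value;
      countSet = true;
    }

    void setName(String value) {
      name = value;
      nameSet = value != null;
    }

    boolean isSetCount() {
      return countSet;
    }

    boolean isSetName() {
      return nameSet;
    }
  }

  /** Copies only the non-null row values into the struct; other fields stay unset. */
  static FakeStruct decode(Map<String, Object> row) {
    FakeStruct struct = new FakeStruct();
    Object count = row.get("count");
    if (count != null) {
      // Unboxing is safe here because null was ruled out above.
      struct.setCount((Integer) count);
    }
    Object name = row.get("name");
    if (name != null) {
      struct.setName((String) name);
    }
    return struct;
  }

  public static void main(String[] args) {
    Map<String, Object> row = new HashMap<>();
    row.put("count", null); // a null value for a primitive thrift field
    row.put("name", "beam");

    FakeStruct struct = decode(row);
    System.out.println(struct.isSetCount()); // false
    System.out.println(struct.isSetName()); // true
  }
}
```

       With a set-all-fields policy, the null `count` would have to be unboxed 
into an `int`, which is where the NPE would come from.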

##########
File path: 
sdks/java/io/thrift/src/main/java/org/apache/beam/sdk/io/thrift/ThriftSchema.java
##########
@@ -90,17 +95,17 @@
  *       parameter exists.
  *   <li>All non-union types have a corresponding java field with the same 
name for every field in
  *       the original thrift source file.
- *   <li>The underlying {@link FieldMetaData#getStructMetaDataMap(Class) 
metadata maps} are {@link
- *       java.util.EnumMap enum maps}, so the natural order of the field keys 
is preserved.

Review comment:
       This is still the case with thrift metadata, but it's no longer a 
"strong assumption" for this class: the field order is now inferred from the 
schema, which allows for field reordering.
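
       As a rough sketch of the idea (not the actual implementation in this 
PR), values can be read in the order of the schema's field names instead of 
the enum map's natural order. The `Fields` enum and the `valuesInSchemaOrder` 
helper are hypothetical names used only for this example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.EnumMap;
import java.util.List;
import java.util.Map;

public class SchemaOrderSketch {
  /** Hypothetical field-id enum, standing in for a generated thrift fields enum. */
  enum Fields {
    ID,
    NAME,
    TAGS
  }

  /** Returns the values in the order of the schema's field names, not the enum's natural order. */
  static List<Object> valuesInSchemaOrder(
      List<String> schemaFieldNames, Map<Fields, Object> valuesByField) {
    List<Object> ordered = new ArrayList<>();
    for (String fieldName : schemaFieldNames) {
      // Look each field up by name, so the schema alone decides the order.
      ordered.add(valuesByField.get(Fields.valueOf(fieldName)));
    }
    return ordered;
  }

  public static void main(String[] args) {
    Map<Fields, Object> values = new EnumMap<>(Fields.class);
    values.put(Fields.ID, 7);
    values.put(Fields.NAME, "beam");
    values.put(Fields.TAGS, Arrays.asList("a", "b"));

    // The schema lists NAME before ID, unlike the enum declaration.
    List<String> schemaOrder = Arrays.asList("NAME", "ID", "TAGS");
    System.out.println(valuesInSchemaOrder(schemaOrder, values)); // [beam, 7, [a, b]]
  }
}
```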



