ccciudatu commented on a change in pull request #13572:
URL: https://github.com/apache/beam/pull/13572#discussion_r548430771
##########
File path:
sdks/java/io/thrift/src/main/java/org/apache/beam/sdk/io/thrift/ThriftSchema.java
##########
@@ -90,17 +95,17 @@
* parameter exists.
* <li>All non-union types have a corresponding java field with the same name for every field in
* the original thrift source file.
- * <li>The underlying {@link FieldMetaData#getStructMetaDataMap(Class) metadata maps} are {@link
- * java.util.EnumMap enum maps}, so the natural order of the field keys is preserved.
* </ul>
*
* <p>Thrift typedefs for container types (and possibly others) do not preserve the full type
* information. For this reason, this class allows for {@link #custom() manual registration} of such
* "lossy" typedefs with their corresponding beam types.
*
- * <p>Note: upon restoring the same thrift object from a Beam {@link
- * org.apache.beam.sdk.values.Row}, the {@link TBase#isSet(TFieldIdEnum) isSet flag} will be {@code
- * true} for all fields, except for non-primitive types with no default values.
+ * <p>Note: Thrift encoding and decoding are not fully symmetrical, i.e. the {@link
+ * TBase#isSet(TFieldIdEnum) isSet} flag may not be preserved upon converting a thrift object to a
+ * beam row and back. On encoding, we extract all thrift values, no matter if the fields are set or
+ * not. On decoding, we set all non-{@code null} beam row values to the corresponding thrift fields,
+ * leaving the rest unset.
Review comment:
I changed the set-ALL-fields policy to set non-null values only, as I think
this is safer (no NPE if primitive thrift fields are null in the beam row),
easier to reason about, and more natural for thrift clients (who are used to
checking whether fields are set before using them).
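To illustrate the asymmetry, here is a minimal sketch that does not use the real thrift or beam classes. `FakeStruct`, `encode` and `decode` are hypothetical stand-ins, and a plain `Map` plays the role of the beam row; the only assumption carried over from thrift is that primitive fields track "set" with a separate flag while object fields are unset when `null`.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class IsSetRoundTrip {

  /** Hypothetical stand-in for a generated thrift struct (not a real TBase). */
  static final class FakeStruct {
    private int count;         // primitive: always holds a value, even when unset
    private boolean countSet;  // separate flag, as thrift does for primitives
    private String name;       // object: unset when null

    void setCount(int value) { count = value; countSet = true; }
    void setName(String value) { name = value; }
    boolean isSetCount() { return countSet; }
    boolean isSetName() { return name != null; }
  }

  /** Encoding: extract every field value, whether or not the field is set. */
  static Map<String, Object> encode(FakeStruct struct) {
    Map<String, Object> row = new LinkedHashMap<>();
    row.put("count", struct.count); // an unset primitive still yields its default, 0
    row.put("name", struct.name);   // an unset object field yields null
    return row;
  }

  /** Decoding: copy only non-null row values, leaving the other fields unset. */
  static FakeStruct decode(Map<String, Object> row) {
    FakeStruct struct = new FakeStruct();
    Object count = row.get("count");
    if (count != null) {
      struct.setCount((Integer) count); // never unboxes null, so no NPE
    }
    Object name = row.get("name");
    if (name != null) {
      struct.setName((String) name);
    }
    return struct;
  }

  public static void main(String[] args) {
    FakeStruct original = new FakeStruct(); // nothing set
    FakeStruct restored = decode(encode(original));
    System.out.println(original.isSetCount() + " -> " + restored.isSetCount()); // false -> true
    System.out.println(original.isSetName() + " -> " + restored.isSetName());   // false -> false
  }
}
```

The unset primitive comes back as set because its default value is non-null in the row, while a `null` in the row for a primitive field is simply skipped on decoding.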
##########
File path:
sdks/java/io/thrift/src/main/java/org/apache/beam/sdk/io/thrift/ThriftSchema.java
##########
@@ -90,17 +95,17 @@
* parameter exists.
* <li>All non-union types have a corresponding java field with the same name for every field in
* the original thrift source file.
- * <li>The underlying {@link FieldMetaData#getStructMetaDataMap(Class) metadata maps} are {@link
- * java.util.EnumMap enum maps}, so the natural order of the field keys is preserved.
Review comment:
This is still the case with thrift metadata, but it's no longer a
"strong assumption" for this class, as the field order is now inferred from the
schema, allowing for field reordering.
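As a rough sketch of the idea only (this is not the code in `ThriftSchema`, and `valuesInSchemaOrder` is a hypothetical helper), driving the order from the schema's field names means that the iteration order of the source map no longer matters:

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class SchemaOrderLookup {

  /**
   * Hypothetical helper: returns the values in the order given by the schema's
   * field names, looking each value up by name instead of relying on the
   * iteration order of the map that holds them.
   */
  static List<Object> valuesInSchemaOrder(
      List<String> schemaFieldNames, Map<String, Object> valuesByName) {
    Object[] ordered = new Object[schemaFieldNames.size()];
    for (int i = 0; i < ordered.length; i++) {
      ordered[i] = valuesByName.get(schemaFieldNames.get(i)); // null if absent
    }
    return Arrays.asList(ordered);
  }

  public static void main(String[] args) {
    Map<String, Object> values = new HashMap<>(); // iteration order is unspecified
    values.put("id", 1);
    values.put("name", "x");
    // The schema lists "name" before "id", so the result follows that order.
    System.out.println(valuesInSchemaOrder(Arrays.asList("name", "id"), values));
  }
}
```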