lostluck commented on a change in pull request #12426:
URL: https://github.com/apache/beam/pull/12426#discussion_r465189010



##########
File path: 
model/fn-execution/src/main/resources/org/apache/beam/model/fnexecution/v1/standard_coders.yaml
##########
@@ -384,3 +384,31 @@ nested: false
 examples:
   "\x02\x01\x02\x01": {f_bool: True, f_bytes: null}
   "\x02\x00\x00\x04ab\x00c": {f_bool: False, f_bytes: "ab\0c"}
+
+---
+
+# Binary data generated with the python SDK:
+#
+# import typing
+# import apache_beam as beam
+# class Test(typing.NamedTuple):
+#   f_map: typing.Mapping[str,int]
+# schema = beam.typehints.schemas.named_tuple_to_schema(Test)
+# coder = beam.coders.row_coder.RowCoder(schema)
+# print("payload = %s" % schema.SerializeToString())
+# examples = (Test(f_map={}),
+#             Test(f_map={"foo": 9001, "bar": 9223372036854775807}),
+#             Test(f_map={"everything": None, "is": None, "null!": None, "¯\_(ツ)_/¯": None}))
+# for example in examples:
+#   print("example = %s" % coder.encode(example))
+coder:
+  urn: "beam:coder:row:v1"
+  # f_map: map<str, nullable int64>
+  payload: "\n\x15\n\x05f_map\x1a\x0c*\n\n\x02\x10\x07\x12\x04\x08\x01\x10\x04\x12$d8c8f969-14e6-457f-a8b5-62a1aec7f1cd"
+  # map ordering is non-deterministic
+  non_deterministic: True
+nested: false

Review comment:
       As it stands, this is confusing for SDK authors writing tests against 
standard_coders.yaml. Now that I have the Go tests written, I need to explicitly 
ignore the nested field for the row coders, because they are all set to 
nested: false rather than nested: true.
   
   This is per my thread on the dev list: 
https://lists.apache.org/thread.html/r7da098363e6ce607ce96f9fbedb08f9f4757bedd68846aaeba5dd4f0%40%3Cdev.beam.apache.org%3E
   
   Portability only ever supports nested coders. The semantics of 
standard_coders.yaml say:
   ```
   #   nested: a boolean meaning whether the coder was used in the nested context. Missing means to
   #           test both contexts, a shorthand for when the coder is invariant across context.
   ```
   
https://github.com/apache/beam/blob/587dde57cbb2b0095a1fa04b59798d1b62c66f18/model/fn-execution/src/main/resources/org/apache/beam/model/fnexecution/v1/standard_coders.yaml#L24
   In other words, nested: false means that the outermost encoding has the 
length prefix if necessary.
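
A minimal sketch of how a test harness might read the `nested` field quoted above, assuming each YAML document has already been parsed into a dict. `contexts_to_test` and the two trimmed specs are hypothetical names for illustration, not part of any Beam SDK.

```python
from typing import Any, Dict, List


def contexts_to_test(spec: Dict[str, Any]) -> List[bool]:
    """Return the nested contexts a coder test spec asks to exercise.

    Per the header comment of standard_coders.yaml, a missing `nested`
    key means "test both contexts".
    """
    if "nested" not in spec:
        return [False, True]
    return [bool(spec["nested"])]


# Hypothetical specs, trimmed to the fields relevant here.
row_spec_as_written = {"coder": {"urn": "beam:coder:row:v1"}, "nested": False}
row_spec_without_nested = {"coder": {"urn": "beam:coder:row:v1"}}

print(contexts_to_test(row_spec_as_written))      # only the outer context
print(contexts_to_test(row_spec_without_nested))  # both contexts
```

With the field left out, a harness written this way would run the same expected bytes through both contexts, which is the shorthand the YAML header describes for context-invariant coders.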
   
   Structurally, there's never a reason for a single schema value to have the 
wrapping length prefix (the prefix is orthogonal to this aspect of the encoding, 
since any sub-component is always nested as needed), so it's not included in the 
various payload examples.
   
   So, I reiterate: why is this nested: false instead of nested: true, if the 
encoding is going to be identical in both contexts?
   
   



