Hello dev@! I have questions.

Context: I'm adding Beam Schemas to the Go SDK, and on the way I'm
validating the Go SDK coders against standard_coders.yaml, per BEAM-7009[0].

*When is it reasonable for a runner to send an SDK an unnested byte or
string coder (AKA, no length prefixing)?*
*What contexts are considered "unnested"?* "Nested" isn't a documented
property anywhere (except probably in the Python or Java code, which isn't
useful from a portability perspective), so it's not clear how SDK
developers are supposed to know what it means and does.
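
To make the distinction concrete, here is a minimal sketch of how I
currently understand it, based on the bytes:v1 examples in
standard_coders.yaml [2]: nested means the payload is preceded by its
length as a varint, unnested means the raw bytes alone. The helper name
encodeBytes is hypothetical and is not part of the Go SDK.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// encodeBytes is a hypothetical helper, not a Go SDK API.
// Assumption: in a nested context the payload is preceded by its length
// as an unsigned varint; in an unnested context the bytes are written as-is.
func encodeBytes(b []byte, nested bool) []byte {
	if !nested {
		// Unnested: copy the payload unchanged.
		return append([]byte(nil), b...)
	}
	// Nested: varint length prefix, then the payload.
	var lp [binary.MaxVarintLen64]byte
	n := binary.PutUvarint(lp[:], uint64(len(b)))
	out := append([]byte(nil), lp[:n]...)
	return append(out, b...)
}

func main() {
	fmt.Printf("%q\n", encodeBytes([]byte("abc"), false)) // "abc"
	fmt.Printf("%q\n", encodeBytes([]byte("abc"), true))  // "\x03abc"
}
```

If that reading is right, the two encodings of the same value differ only
by the prefix, and nothing in the bytes themselves says which one was used.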

I'm going on experience here rather than on documentation: neither the
proto spec [1] nor standard_coders.yaml [2] gives a portable specification
of which contexts are supposed to be nested and when, which implies it's a
holdover from pre-portability.

But most importantly, *when is it actually used?*

I understand there's a theoretical value in not needing to know the length
ahead of time when encoding very large single elements, but is that
property ever taken advantage of anywhere?

Currently, the Go SDK doesn't support unnested coders at all. All bytes:v1
and string_utf8:v1 coders are assumed to be length prefixed, so values
generated by the SDK will always be marked as LP for those variable-length
coders, and that's what the pipeline generates for them.
What's not clear to me is when an SDK should assume the bytes aren't
length prefixed: neither that nor the intended purpose of the distinction
is documented anywhere.

I'm happy to go ahead and add such documentation to the protos or
standard_coders.yaml file for posterity, but I can't until I understand the
situation better. I'd like it to be documented so new SDK authors don't run
into the same confusions I have.

Thanks, and Cheers.
Robert Burke

[0] https://issues.apache.org/jira/browse/BEAM-7009
[1]
https://github.com/apache/beam/blob/a5b2046b10bebc59c5bde41d4cb6498058fdada2/model/pipeline/src/main/proto/beam_runner_api.proto#L672
[2]
https://github.com/apache/beam/blob/master/model/fn-execution/src/main/resources/org/apache/beam/model/fnexecution/v1/standard_coders.yaml#L18
