Hi all,
It seems a bit unfortunate that there isn’t a portable way to serialize a
boolean value.
I’m working on porting my external PubsubIO PR over to use the improved
schema-based external transform API in python, but because of this
limitation I can’t use boolean values. For example, this fails:
ReadFromPubsubSchema = typing.NamedTuple(
'ReadFromPubsubSchema',
[
('topic', typing.Optional[unicode]),
('subscription', typing.Optional[unicode]),
('id_label', typing.Optional[unicode]),
('with_attributes', bool),
('timestamp_attribute', typing.Optional[unicode]),
]
)
It fails because coders.get_coder(bool) returns the non-portable pickle
coder.
In the short term I can hack something into the external transform API to
use varint coder for bools, but this kind of hacky approach to portability
won’t work in scenarios where round-tripping is required without user
intervention. In other words, in python it is not uncommon to test if x is
True, in which case the integer 1 would fail this test. All of that is to
say that a BooleanCoder would be a convenient way to ensure the proper type
is used everywhere.
So, I was just wondering why it’s not there? Are there concerns over
whether booleans are universal enough to make part of the portability
standard?
-chad