Are there any dissenting votes to making a BooleanCoder a standard
(portable) coder?

I'm happy to make a PR to implement a BooleanCoder in Python (and to add
the Java BooleanCoder to the ModelCoderRegistrar) if everyone agrees that
this is useful.

-chad


On Fri, Sep 27, 2019 at 3:32 PM Robert Bradshaw <[email protected]> wrote:

> I think boolean is useful to have. What I'm more skeptical of is
> adding standard types for variations like UnsignedInteger16, etc. that
> don't have natural representations in all languages.
>
> On Fri, Sep 27, 2019 at 2:46 PM Brian Hulette <[email protected]> wrote:
> >
> > Some more context from an offline discussion I had with +Robert Bradshaw
> > a while ago: We both agreed all of the coders listed in BEAM-7996 should be
> > implemented in Python, but didn't come to a conclusion on whether or not
> > they should actually be _standard_ coders, versus just being implicitly
> > standard as part of row coder.
> >
> > On Fri, Sep 27, 2019 at 2:29 PM Kenneth Knowles <[email protected]> wrote:
> >>
> >> Yes, noted here:
> >> https://github.com/apache/beam/pull/9188/files#diff-f0d64c2cfc4583bfe2a7e5ee59818ae2R678
> >> and that links to https://issues.apache.org/jira/browse/BEAM-7996
> >>
> >> Kenn
> >>
> >> On Fri, Sep 27, 2019 at 12:57 PM Reuven Lax <[email protected]> wrote:
> >>>
> >>> Java has one, implemented as a byte coder. My guess is that nobody has
> >>> gotten around to implementing it yet for portability.
> >>>
> >>> On Fri, Sep 27, 2019 at 12:44 PM Chad Dombrova <[email protected]> wrote:
> >>>>
> >>>> Hi all,
> >>>> It seems a bit unfortunate that there isn’t a portable way to
> >>>> serialize a boolean value.
> >>>>
> >>>> I’m working on porting my external PubsubIO PR over to use the
> >>>> improved schema-based external transform API in Python, but because of
> >>>> this limitation I can’t use boolean values. For example, this fails:
> >>>>
> >>>> ReadFromPubsubSchema = typing.NamedTuple(
> >>>>     'ReadFromPubsubSchema',
> >>>>     [
> >>>>         ('topic', typing.Optional[unicode]),
> >>>>         ('subscription', typing.Optional[unicode]),
> >>>>         ('id_label', typing.Optional[unicode]),
> >>>>         ('with_attributes', bool),
> >>>>         ('timestamp_attribute', typing.Optional[unicode]),
> >>>>     ]
> >>>> )
> >>>>
> >>>> It fails because coders.get_coder(bool) returns the non-portable
> >>>> pickle coder.
> >>>>
> >>>> In the short term I can hack something into the external transform
> >>>> API to use the varint coder for bools, but this kind of hacky approach
> >>>> to portability won’t work in scenarios where round-tripping is required
> >>>> without user intervention. In other words, in Python it is not uncommon
> >>>> to test if x is True, in which case the integer 1 would fail this test.
> >>>> All of that is to say that a BooleanCoder would be a convenient way to
> >>>> ensure the proper type is used everywhere.
> >>>>
> >>>> So, I was just wondering why it’s not there? Are there concerns over
> >>>> whether booleans are universal enough to make part of the portability
> >>>> standard?
> >>>>
> >>>> -chad
>
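
The round-tripping concern raised in the original message (a bool sent
through a varint coder comes back as the integer 1, so "x is True" no longer
holds) can be illustrated with a small sketch. Everything below is
hypothetical and stands apart from Beam: encode_varint, decode_varint,
encode_bool and decode_bool are illustrative names, not Beam APIs, and the
one-byte 0x00/0x01 wire format is an assumption based only on the remark that
the Java coder is "implemented as a byte coder".

```python
def encode_varint(value):
    """Encode a non-negative int as a base-128 varint (illustrative only)."""
    if value < 0:
        raise ValueError("this sketch only handles non-negative values")
    out = bytearray()
    while True:
        low_bits = value & 0x7F
        value >>= 7
        if value:
            out.append(low_bits | 0x80)  # more bytes follow
        else:
            out.append(low_bits)  # final byte
            return bytes(out)


def decode_varint(data):
    """Decode a base-128 varint; the result is always a plain int."""
    result = 0
    shift = 0
    for byte in data:
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return result
        shift += 7
    raise ValueError("truncated varint")


def encode_bool(value):
    """Hypothetical boolean encoding: one byte, 0x01 for True, 0x00 for False."""
    if not isinstance(value, bool):
        raise TypeError("expected a bool, got %r" % type(value))
    return b"\x01" if value else b"\x00"


def decode_bool(data):
    """Decode the hypothetical one-byte encoding back into a real bool."""
    if data == b"\x01":
        return True
    if data == b"\x00":
        return False
    raise ValueError("invalid boolean encoding: %r" % (data,))


# Round trip through the varint workaround: the type is lost.
via_varint = decode_varint(encode_varint(True))

# Round trip through the dedicated boolean encoding: the type is preserved.
via_bool = decode_bool(encode_bool(True))
```

With these definitions, via_varint compares equal to True but is an int, so
an identity check such as "via_varint is True" fails, whereas via_bool is the
True singleton itself. A dedicated coder avoids the need for the caller to
convert the value back by hand.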
