I just put up a PR adding support for BOOL and BYTES in the Python RowCoder [1]. Both are standard coders implemented in Python already so it was pretty easy to add. The remaining coders tracked in BEAM-7996 will require new coder implementations in Python.
[1] https://github.com/apache/beam/pull/12324 On Fri, Jul 17, 2020 at 9:25 AM Robert Bradshaw <rober...@google.com> wrote: > On Fri, Jul 17, 2020 at 9:19 AM Piotr Szuberski > <piotr.szuber...@polidea.com> wrote: > > > > I will consider mapping KinesisRecord to Row and then sending it via > cross-language, but I think that for now python's RowCoder does not support > bytes (correct me if I'm wrong) > > Huh, looks like you're right: > > https://github.com/apache/beam/blob/master/sdks/python/apache_beam/coders/row_coder.py#L101 > This should be easy to add; PRs welcome! > > > On 2020/07/16 23:07:06, Luke Cwik <lc...@google.com> wrote: > > > If you want to send across a "rich" data record, consider defining a > schema > > > and using a row coder since row coder is XLang compatible. > > > > > > On Thu, Jul 16, 2020 at 9:28 AM Robert Bradshaw <rober...@google.com> > wrote: > > > > > > > Note also that once you get the Bytes in Python, you can use whatever > > > > coder (or Map) to decode them that you want. > > > > > > > > On Thu, Jul 16, 2020 at 9:21 AM Boyuan Zhang <boyu...@google.com> > wrote: > > > > > > > > > > Hi Piotr, > > > > > > > > > > X-Lang uses TypedWithoutMetadata, which outputs the KV directly > instead > > > > of KafkaRecord: see here. Given the limit that x-lang can only work > with > > > > well-known coders, if you want to process the KV in python output > from > > > > KafkaIO, the coders of key and value should be well-known in beam. By > > > > default, the key and value are bytes: see here. > > > > > > > > > > On Thu, Jul 16, 2020 at 8:48 AM Piotr Szuberski < > > > > piotr.szuber...@polidea.com> wrote: > > > > >> > > > > >> I'm writing a python wrappers for KinesisIO and I encountered a > problem > > > > that Read transform creates a PCollection with KinesisRecord class > which's > > > > coder by default is assigned as 'beam:coders:javasdk:0.1'. I managed > to > > > > register this coder using CoderTranslatorRegistrar which adds the > coder to > > > > the KNOWN_CODER_URNS and therefore is sent with my custom urn. > > > > >> > > > > >> Kafka's cross language Write transform uses KV<>, is encoded by > > > > default in beam. > > > > >> > > > > >> But I can't see how KafkaRecordCoder is translated in > cross-language > > > > usage to python? I can't see any place in code where it gets > registered. > > > > >> > > > > >> I just don't get how KafkaIO.Read works in cross-language. Could > > > > someone clarify me how does it work? > > > > >> > > > > >> Thanks in advance! > > > > > > > >