Hi Piotr, X-Lang uses TypedWithoutMetadata <https://github.com/apache/beam/blob/d957ce480604be623b0df77e62948c3e0d2c2453/sdks/java/io/kafka/src/main/java/org/apache/beam/sdk/io/kafka/KafkaIO.java#L1024>, which outputs the KV directly instead of KafkaRecord: see here <https://github.com/apache/beam/blob/d957ce480604be623b0df77e62948c3e0d2c2453/sdks/java/io/kafka/src/main/java/org/apache/beam/sdk/io/kafka/KafkaIO.java#L467>. Given the limit that x-lang can only work with well-known coders, if you want to process the KV in python output from KafkaIO, the coders of key and value should be well-known in beam. By default, the key and value are bytes: see here <https://github.com/apache/beam/blob/d957ce480604be623b0df77e62948c3e0d2c2453/sdks/python/apache_beam/io/kafka.py#L117> .
On Thu, Jul 16, 2020 at 8:48 AM Piotr Szuberski <piotr.szuber...@polidea.com> wrote: > I'm writing a python wrappers for KinesisIO and I encountered a problem > that Read transform creates a PCollection with KinesisRecord class which's > coder by default is assigned as 'beam:coders:javasdk:0.1'. I managed to > register this coder using CoderTranslatorRegistrar which adds the coder to > the KNOWN_CODER_URNS and therefore is sent with my custom urn. > > Kafka's cross language Write transform uses KV<>, is encoded by default > in beam. > > But I can't see how KafkaRecordCoder is translated in cross-language usage > to python? I can't see any place in code where it gets registered. > > I just don't get how KafkaIO.Read works in cross-language. Could someone > clarify me how does it work? > > Thanks in advance! >