Hi Piotr,

For cross-language use, KafkaIO.Read goes through TypedWithoutMetadata
<https://github.com/apache/beam/blob/d957ce480604be623b0df77e62948c3e0d2c2453/sdks/java/io/kafka/src/main/java/org/apache/beam/sdk/io/kafka/KafkaIO.java#L1024>,
which outputs the KV directly instead of a KafkaRecord: see here
<https://github.com/apache/beam/blob/d957ce480604be623b0df77e62948c3e0d2c2453/sdks/java/io/kafka/src/main/java/org/apache/beam/sdk/io/kafka/KafkaIO.java#L467>.
The output is therefore a plain KV, and KafkaRecordCoder does not need to be
translated. Because x-lang can only work with well-known coders, if you want
to process in Python the KVs output by KafkaIO, the key and value coders must
be well-known in Beam. By default, the key and value are bytes: see here
<https://github.com/apache/beam/blob/d957ce480604be623b0df77e62948c3e0d2c2453/sdks/python/apache_beam/io/kafka.py#L117>.
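
Since the key and value reach Python as raw bytes by default, any further
decoding has to happen on the Python side. Here is a minimal sketch of such
a step. The helper name decode_kafka_kv is hypothetical, and it assumes the
producer writes UTF-8 keys and UTF-8 JSON values; adjust it to your actual
payload format.

```python
import json
from typing import Any, Optional, Tuple


def decode_kafka_kv(kv: Tuple[Optional[bytes], bytes]) -> Tuple[Optional[str], Any]:
    """Decode one (key, value) pair of raw bytes into Python objects.

    Assumption (not from KafkaIO itself): keys are UTF-8 text and values
    are UTF-8 encoded JSON. A missing key is passed through as None.
    """
    key_bytes, value_bytes = kv
    # Kafka messages may carry no key, so guard against None.
    key = key_bytes.decode("utf-8") if key_bytes is not None else None
    # Parse the value bytes as a JSON document.
    value = json.loads(value_bytes.decode("utf-8"))
    return key, value
```

In a pipeline, something like this would typically be applied with
beam.Map(decode_kafka_kv) right after the ReadFromKafka transform from
apache_beam.io.kafka, with the default byte-array deserializers left in place.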

On Thu, Jul 16, 2020 at 8:48 AM Piotr Szuberski <piotr.szuber...@polidea.com>
wrote:

> I'm writing a Python wrapper for KinesisIO and I encountered a problem:
> the Read transform creates a PCollection of the KinesisRecord class, whose
> coder is assigned by default as 'beam:coders:javasdk:0.1'. I managed to
> register this coder using CoderTranslatorRegistrar, which adds the coder to
> KNOWN_CODER_URNS, so it is sent with my custom URN.
>
> Kafka's cross-language Write transform uses KV<>, which is encoded by
> default in Beam.
>
> But I can't see how KafkaRecordCoder is translated to Python in
> cross-language usage. I can't find any place in the code where it gets
> registered.
>
> I just don't get how KafkaIO.Read works in cross-language. Could someone
> explain to me how it works?
>
> Thanks in advance!
>
