Hi Nicolas,

Thank you for your contribution. I’d recommend starting from this page:
https://beam.apache.org/contribute/#share-your-intent 

I’ll be glad to review your PR once it’s ready.
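For background on your coder question: SerializableCoder essentially round-trips each element through standard Java serialization, which is why it fails on non-Serializable header values like LongString. A minimal sketch of that mechanism, using plain java.io (the Message class here is hypothetical, just for illustration):

```java
import java.io.*;

// Hypothetical element type, standing in for a pipeline element.
class Message implements Serializable {
    final String body;
    Message(String body) { this.body = body; }
}

public class CoderSketch {
    // Roughly what SerializableCoder does on encode: standard Java serialization.
    static byte[] encode(Serializable value) throws IOException {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        try (ObjectOutputStream oos = new ObjectOutputStream(bos)) {
            oos.writeObject(value);
        }
        return bos.toByteArray();
    }

    // ...and on decode: standard Java deserialization.
    static Object decode(byte[] bytes) throws IOException, ClassNotFoundException {
        try (ObjectInputStream ois = new ObjectInputStream(new ByteArrayInputStream(bytes))) {
            return ois.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        Message original = new Message("hello");
        Message copy = (Message) decode(encode(original));
        System.out.println(copy.body); // prints "hello"
    }
}
```

If any field of the element (or anything reachable from it) is not Serializable, the encode step throws NotSerializableException, which matches the symptom you describe.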

> On 24 May 2019, at 11:01, Nicolas Delsaux <nicolas.dels...@gmx.fr> wrote:
> 
> Hi all
> 
> I'm currently evaluating Apache Beam to transfer messages from RabbitMQ
> to Kafka with some transforms in between.
> 
> Doing so, I've discovered some differences between the direct runner's
> behaviour and the Google Dataflow runner's.
> 
> But first, a small introduction to what I know.
> 
> From what I understand, elements transmitted between two different
> transforms are serialized/deserialized.
> 
> This (de)serialization process is normally managed by a Coder, of which
> the most commonly used is SerializableCoder, which takes a
> serializable object and (de)serializes it using classical Java mechanisms.
> 
> On the direct runner, I had issues with RabbitMQ messages, as their
> headers contain objects that are LongStrings, an interface implemented
> solely in a private static class of the RabbitMQ client, and used for
> large text messages.
> 
> So I wrote a RabbitMqMessageCoder and installed it in my pipeline
> using
> pipeline.getCoderRegistry().registerCoderForClass(RabbitMqMessage.class,
> new MyCoder()).
> 
> And it worked! Well, not on the Dataflow runner.
> 
> 
> Indeed, it seems the Dataflow runner doesn't use this coder registry
> mechanism (for reasons I absolutely don't understand).
> 
> So my fix didn't work.
> 
> After various tries, I finally gave up and directly modified the
> RabbitMqIO class (from Apache Beam) to handle my case.
> 
> This fix is available on my Beam fork on GitHub, and I would like to
> have it integrated.
> 
> What is the procedure to do so?
> 
> Thanks!
> 
