Hi, You can create a PullRequest, I will do the review.
The coder is set on the RabbitMQIO PTransform, so, it should work. AFAIR, we have a Jira about that and I already started to check. Not yet completed yet. Regards JB On 24/05/2019 11:01, Nicolas Delsaux wrote: > Hi all > > I'm currently evaluationg Apache Beam to transfer messages from RabbitMq > to kafka with some transform in between. > > Doing, so, i've discovered some differences between direct runner > behaviour and Google Dataflow runner. > > But first, a small introduction to what I know. > > From what I understand, elements transmitted between two different > transforms are serialized/deserialized. > > This (de)serialization process is normally managed by Coder, in which > the most used is obviously the Serializablecoder, which takes a > serializable object and (de)serialize it using classical java mechanisms. > > On direct runner, i had issues with rabbitMq messages, as they contain > in their headers objects that are LongString, an interface implemented > solely in a private static class of RabbitMq, and used for large text > messages. > > So I wrote a RabbitMqMessageCoder, and installed it in my pipeline > (using > pipeline.getCoderregistry().registerCoderForClass(RabbitMqMessage.class, > new MyCoder()) > > And it worked ! well, not in dataflow runner. > > > indeed, it seems like dataflow runner don't use this coder registry > mechanism (for reasons I absolutely don't understand). > > So my fix didn't work. > > After various tries, I finally gave up and directly modified the > RabbitMqIO class (from Apache Beam) to handle my case. > > This fix is available on my Beam fork on GitHub, and i would like to > have it integrated. > > What is the procedure to do so ? > > Thanks ! > -- Jean-Baptiste Onofré jbono...@apache.org http://blog.nanthrax.net Talend - http://www.talend.com