Hi Matt,

1. There’s some in-progress work on wrapper util classes for Kafka de/serializers here [1] that allows Kafka de/serializers to be used with the Flink Kafka Consumers/Producers with minimal user overhead. The PR also has some proposed additions to the documentation for the wrappers.
2. I feel it would be good to have more documentation on Flink’s de/serializers, because they are frequently asked about on the mailing lists. At the same time, the fastest / most efficient de/serialization approach will probably be tailored to each use case, so we’d need to think more about the presentation and purpose of that documentation.

Cheers,
Gordon

[1] https://github.com/apache/flink/pull/2705

On December 8, 2016 at 5:00:19 AM, milind parikh (milindspar...@gmail.com) wrote:

Why not use a self-describing format (JSON), stream as String, and read it through a JSON reader, avoiding top-level reflection?

Github.com/milindparikh/streamingsi
https://github.com/milindparikh/streamingsi/tree/master/epic-poc/sprint-2-simulated-data-no-cdc-advanced-eventing/2-dataprocessing ?

Apologies if I misunderstood the question, but I can quite see how to model your Product class (or indeed any POJO) in a fairly generic way (assuming JSON). The real issue arises when you have different versions of the same POJO class: that requires storing enough information to dynamically instantiate the actual version of the class, which I believe is beyond the simple use case.

Milind

On Dec 7, 2016 2:42 PM, "Matt" <dromitl...@gmail.com> wrote:

I've read your example, but I've found the same problem. You're serializing your POJO as a string, where all fields are separated by "\t". This may work for you, but not in general.

https://github.com/luigiselmi/pilot-sc4-fcd-producer/blob/master/src/main/java/eu/bde/sc4pilot/fcd/FcdTaxiEvent.java#L60

I would like to see a more "generic" approach for the class Product in my last message. I believe a more general-purpose de/serializer for POJOs should be achievable using reflection.

On Wed, Dec 7, 2016 at 1:16 PM, Luigi Selmi <luigise...@gmail.com> wrote:

Hi Matt,

I had the same problem, trying to read some records in event time using a POJO, doing some transformation and saving the result into Kafka for further processing.
I am not yet done, but maybe the code I wrote, starting from the Flink Forward 2016 training docs, could be useful.

https://github.com/luigiselmi/pilot-sc4-fcd-producer

Best,
Luigi

On 7 December 2016 at 16:35, Matt <dromitl...@gmail.com> wrote:

Hello,

I don't quite understand how to integrate Kafka and Flink. After a lot of thought and hours of reading I feel I'm still missing something important.

So far I haven't found a non-trivial but simple example of a stream of a custom class (POJO). It would be good to have such an example in the Flink docs. I can think of many scenarios in which using SimpleStringSchema is not an option, but all Kafka+Flink guides insist on using it. Maybe we can add a simple example to the documentation [1]; it would be really helpful for many of us.

Also, explaining how to create a Flink De/SerializationSchema from a Kafka De/Serializer would be really useful and would save a lot of people a lot of time. It's not clear why you need both of them, or whether you need both of them.

As far as I know Avro is a common choice for serialization, but I've read that Kryo's performance is much better (true?). I guess, though, that the fastest serialization approach is writing your own de/serializer.

1. What do you think about adding some thoughts on this to the documentation?
2. Can anyone provide an example for the following class?

---
public class Product {
    public String code;
    public double price;
    public String description;
    public long created;
}
---

Regards,
Matt

[1] http://data-artisans.com/kafka-flink-a-practical-how-to/

--
Luigi Selmi, M.Sc.
Fraunhofer IAIS
Schloss Birlinghoven . 53757 Sankt Augustin, Germany
Phone: +49 2241 14-2440
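To make Matt's question 2 concrete, here is a minimal hand-written binary codec for the Product POJO from his message: a sketch, not a solution the thread settled on. It uses only the JDK's DataOutputStream/DataInputStream; the class name ProductCodec is illustrative. Its two static methods do exactly the work a Flink SerializationSchema#serialize / DeserializationSchema#deserialize pair would do for this class, which is the sense in which "writing your own de/serializer" is the fastest option: no schema lookup, no reflection, just the four fields in a fixed order.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;

// The Product POJO from the thread.
class Product {
    public String code;
    public double price;
    public String description;
    public long created;
}

// Hand-written binary de/serializer for Product (illustrative name).
// The field order must match exactly between serialize and deserialize.
public class ProductCodec {

    public static byte[] serialize(Product p) throws IOException {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        DataOutputStream out = new DataOutputStream(bos);
        out.writeUTF(p.code);           // length-prefixed modified UTF-8
        out.writeDouble(p.price);
        out.writeUTF(p.description);
        out.writeLong(p.created);
        out.flush();
        return bos.toByteArray();
    }

    public static Product deserialize(byte[] bytes) throws IOException {
        DataInputStream in = new DataInputStream(new ByteArrayInputStream(bytes));
        Product p = new Product();
        p.code = in.readUTF();          // read back in the same order
        p.price = in.readDouble();
        p.description = in.readUTF();
        p.created = in.readLong();
        return p;
    }
}
```

Wrapping this in Flink amounts to implementing DeserializationSchema<Product> and SerializationSchema<Product> and delegating to these two methods; note the sketch assumes non-null String fields (writeUTF rejects null).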
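Matt's other idea, a general-purpose POJO de/serializer via reflection, can also be sketched with the JDK alone. The version below is an assumption-laden illustration (the class name ReflectivePojoSerializer is hypothetical): it handles only public String, double and long fields, requires a no-arg constructor, and sorts fields by name because the JVM does not guarantee a stable order from reflection. A production version would need the versioning Milind mentions, plus null handling and more types.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.lang.reflect.Field;
import java.util.Arrays;
import java.util.Comparator;

// The Product POJO from the thread.
class Product {
    public String code;
    public double price;
    public String description;
    public long created;
}

// Sketch of a generic reflection-based de/serializer for simple POJOs.
// Supports public String, double and long fields only; fields are sorted
// by name so serialize and deserialize agree on a deterministic order.
public class ReflectivePojoSerializer<T> {
    private final Class<T> clazz;
    private final Field[] fields;

    public ReflectivePojoSerializer(Class<T> clazz) {
        this.clazz = clazz;
        this.fields = clazz.getFields();
        Arrays.sort(this.fields, Comparator.comparing(Field::getName));
    }

    public byte[] serialize(T value) throws Exception {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        DataOutputStream out = new DataOutputStream(bos);
        for (Field f : fields) {
            Class<?> t = f.getType();
            if (t == String.class) {
                out.writeUTF((String) f.get(value));
            } else if (t == double.class) {
                out.writeDouble(f.getDouble(value));
            } else if (t == long.class) {
                out.writeLong(f.getLong(value));
            } else {
                throw new IOException("Unsupported field type: " + t);
            }
        }
        out.flush();
        return bos.toByteArray();
    }

    public T deserialize(byte[] bytes) throws Exception {
        DataInputStream in = new DataInputStream(new ByteArrayInputStream(bytes));
        T value = clazz.getDeclaredConstructor().newInstance();
        for (Field f : fields) {          // same sorted order as serialize
            Class<?> t = f.getType();
            if (t == String.class) {
                f.set(value, in.readUTF());
            } else if (t == double.class) {
                f.setDouble(value, in.readDouble());
            } else if (t == long.class) {
                f.setLong(value, in.readLong());
            }
        }
        return value;
    }
}
```

Because it reads and writes only field values, this sketch runs into exactly the limitation Milind raised: two versions of the same POJO class will silently disagree on the byte layout unless the format also records which fields it contains.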