Hi people,

This is what I was talking about regarding a generic de/serializer for POJO classes [1].
The Serde class in [2] can be used in both Kafka [3] and Flink [4], and it works out of the box for any POJO class.

Do you see anything wrong with this approach? Any way to improve it?

Cheers,
Matt

[1] https://github.com/Dromit/StreamTest/
[2] https://github.com/Dromit/StreamTest/blob/master/src/main/java/com/stream/Serde.java
[3] https://github.com/Dromit/StreamTest/blob/master/src/main/java/com/stream/MainProducer.java#L19
[4] https://github.com/Dromit/StreamTest/blob/master/src/main/java/com/stream/MainConsumer.java#L19

On Thu, Dec 8, 2016 at 4:15 AM, Tzu-Li (Gordon) Tai <tzuli...@apache.org> wrote:

> Hi Matt,
>
> 1. There’s some in-progress work on wrapper util classes for Kafka
> de/serializers here [1] that allows Kafka de/serializers to be used with
> the Flink Kafka Consumers/Producers with minimal user overhead. The PR
> also proposes some additions to the documentation for the wrappers.
>
> 2. I feel it would be good to have more documentation on Flink’s
> de/serializers, because they’ve been frequently asked about on the
> mailing lists. At the same time, the fastest / most efficient
> de/serialization approach is probably one tailored to each use case, so
> we’d need to think more about the presentation and purpose of that
> documentation.
>
> Cheers,
> Gordon
>
> [1] https://github.com/apache/flink/pull/2705
>
> On December 8, 2016 at 5:00:19 AM, milind parikh (milindspar...@gmail.com) wrote:
>
> Why not use a self-describing format (JSON), stream it as a String, and
> read it through a JSON reader, avoiding top-level reflection?
>
> Github.com/milindparikh/streamingsi
>
> https://github.com/milindparikh/streamingsi/tree/master/epic-poc/sprint-2-simulated-data-no-cdc-advanced-eventing/2-dataprocessing
>
> Apologies if I misunderstood the question, but I can see how to model
> your Product class (or indeed any POJO) in a fairly generic way
> (assuming JSON).
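For reference, the self-describing JSON idea above can be sketched with nothing beyond java.lang.reflect. The toJson helper below is purely illustrative (it is not from the streamingsi repo) and handles only flat public fields, with no escaping, nesting, or versioning:

```java
import java.lang.reflect.Field;

public class JsonSketch {

    // The Product POJO from earlier in this thread.
    public static class Product {
        public String code;
        public double price;
        public String description;
        public long created;
    }

    // Build a flat JSON object from the public fields of any POJO via
    // reflection. Numbers are emitted bare, everything else as a quoted
    // string. Note: getFields() does not guarantee field order.
    public static String toJson(Object pojo) throws IllegalAccessException {
        StringBuilder sb = new StringBuilder("{");
        Field[] fields = pojo.getClass().getFields();
        for (int i = 0; i < fields.length; i++) {
            Object value = fields[i].get(pojo);
            sb.append('"').append(fields[i].getName()).append("\":");
            if (value instanceof Number) {
                sb.append(value);
            } else {
                sb.append('"').append(value).append('"');
            }
            if (i < fields.length - 1) sb.append(',');
        }
        return sb.append('}').toString();
    }

    public static void main(String[] args) throws Exception {
        Product p = new Product();
        p.code = "X1";
        p.price = 9.99;
        p.description = "widget";
        p.created = 1481000000L;
        System.out.println(toJson(p));
    }
}
```

Being a String, the result works with SimpleStringSchema on the Flink side and StringSerializer on the Kafka side, which is presumably the point of the suggestion.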
>
> The real issue, which you face when you have different versions of the
> same POJO class, is storing enough information to dynamically
> instantiate the actual version of the class; that, I believe, is beyond
> the simple use case.
>
> Milind
>
> On Dec 7, 2016 2:42 PM, "Matt" <dromitl...@gmail.com> wrote:
>
>> I've read your example, but I've found the same problem: you're
>> serializing your POJO as a string in which all fields are separated by
>> "\t". This may work for you, but not in general.
>>
>> https://github.com/luigiselmi/pilot-sc4-fcd-producer/blob/master/src/main/java/eu/bde/sc4pilot/fcd/FcdTaxiEvent.java#L60
>>
>> I would like to see a more "generic" approach for the class Product in
>> my last message. I believe a more general-purpose de/serializer for
>> POJOs should be achievable using reflection.
>>
>> On Wed, Dec 7, 2016 at 1:16 PM, Luigi Selmi <luigise...@gmail.com> wrote:
>>
>>> Hi Matt,
>>>
>>> I had the same problem: trying to read some records in event time using
>>> a POJO, do some transformation, and save the result into Kafka for
>>> further processing. I am not yet done, but maybe the code I wrote,
>>> starting from the Flink Forward 2016 training docs
>>> <http://dataartisans.github.io/flink-training/exercises/popularPlaces.html>,
>>> could be useful.
>>>
>>> https://github.com/luigiselmi/pilot-sc4-fcd-producer
>>>
>>> Best,
>>>
>>> Luigi
>>>
>>> On 7 December 2016 at 16:35, Matt <dromitl...@gmail.com> wrote:
>>>
>>>> Hello,
>>>>
>>>> I don't quite understand how to integrate Kafka and Flink. After a lot
>>>> of thought and hours of reading I feel I'm still missing something
>>>> important.
>>>>
>>>> So far I haven't found a non-trivial but simple example of a stream of
>>>> a custom class (POJO). It would be good to have such an example in the
>>>> Flink docs. I can think of many scenarios in which using
>>>> SimpleStringSchema is not an option, but all Kafka+Flink guides insist
>>>> on using it.
>>>>
>>>> Maybe we can add a simple example to the documentation [1]; it would
>>>> be really helpful for many of us. Also, explaining how to create a
>>>> Flink De/SerializationSchema from a Kafka De/Serializer would be very
>>>> useful and would save a lot of people a lot of time; it's not clear
>>>> whether you need both of them, or why.
>>>>
>>>> As far as I know, Avro is a common choice for serialization, but I've
>>>> read that Kryo's performance is much better (true?). I guess, though,
>>>> that the fastest serialization approach is writing your own
>>>> de/serializer.
>>>>
>>>> 1. What do you think about adding some thoughts on this to the
>>>> documentation?
>>>> 2. Can anyone provide an example for the following class?
>>>>
>>>> ---
>>>> public class Product {
>>>>   public String code;
>>>>   public double price;
>>>>   public String description;
>>>>   public long created;
>>>> }
>>>> ---
>>>>
>>>> Regards,
>>>> Matt
>>>>
>>>> [1] http://data-artisans.com/kafka-flink-a-practical-how-to/
>>>
>>> --
>>> Luigi Selmi, M.Sc.
>>> Fraunhofer IAIS, Schloss Birlinghoven
>>> 53757 Sankt Augustin, Germany
>>> Phone: +49 2241 14-2440
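For anyone finding this thread later: I haven't verified whether this matches what Serde.java in [2] does, but a dependency-free sketch of a generic POJO <-> byte[] round trip (the shape that both Kafka's Serializer/Deserializer and Flink's De/SerializationSchema ultimately wrap) can be built on plain java.io serialization; the POJO only needs to implement Serializable:

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

public class PojoBytes {

    // Any Serializable object -> byte[] via built-in Java serialization.
    public static byte[] serialize(Object pojo) throws IOException {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bos)) {
            out.writeObject(pojo);
        }
        return bos.toByteArray();
    }

    // byte[] -> object; the caller's assignment context fixes T.
    @SuppressWarnings("unchecked")
    public static <T> T deserialize(byte[] bytes)
            throws IOException, ClassNotFoundException {
        try (ObjectInputStream in =
                new ObjectInputStream(new ByteArrayInputStream(bytes))) {
            return (T) in.readObject();
        }
    }

    // The Product class from the thread, made Serializable.
    public static class Product implements Serializable {
        public String code;
        public double price;
        public String description;
        public long created;
    }

    public static void main(String[] args) throws Exception {
        Product p = new Product();
        p.code = "A7";
        p.price = 19.5;
        p.description = "gadget";
        p.created = 1481000000L;
        Product copy = deserialize(serialize(p));
        System.out.println(copy.code + " " + copy.price); // A7 19.5
    }
}
```

This is a sketch only: Java serialization is generic with zero effort but is not compact or fast, which is why Avro, Kryo, or a hand-written schema usually wins in the benchmarks Matt mentions.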