Hi Matt,

1. There’s some in-progress work on wrapper util classes for Kafka de/serializers here [1] that allows Kafka de/serializers to be used with the Flink Kafka consumers/producers with minimal user overhead. The PR also includes some proposed additions to the documentation for the wrappers. (A rough sketch of the wrapper idea is below.)

2. I feel it would be good to have more documentation on Flink’s de/serializers, because they’ve been asked about frequently on the mailing lists. At the same time, the fastest / most efficient de/serialization approach is usually tailored to each use case, so we’d need to think more about the presentation and purpose of such documentation.
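
To give a rough idea of the wrapper concept (this is a sketch, not the PR’s actual code; the class name is made up, and the schema interface packages differ between Flink versions):

---
import java.io.IOException;

import org.apache.flink.api.common.typeinfo.TypeInformation;
import org.apache.flink.api.java.typeutils.TypeExtractor;
import org.apache.flink.streaming.util.serialization.DeserializationSchema;
import org.apache.kafka.common.serialization.Deserializer;

// Sketch: adapt a Kafka Deserializer so it can be handed to a
// FlinkKafkaConsumer in place of a hand-written DeserializationSchema.
public class KafkaDeserializerWrapper<T> implements DeserializationSchema<T> {

    private final Class<? extends Deserializer<T>> deserializerClass;
    private final Class<T> producedType;

    // The Kafka deserializer itself may not be java.io.Serializable,
    // so it is created lazily on the task rather than shipped with the job.
    private transient Deserializer<T> deserializer;

    public KafkaDeserializerWrapper(
            Class<? extends Deserializer<T>> deserializerClass, Class<T> producedType) {
        this.deserializerClass = deserializerClass;
        this.producedType = producedType;
    }

    @Override
    public T deserialize(byte[] message) throws IOException {
        if (deserializer == null) {
            try {
                deserializer = deserializerClass.newInstance();
                // Note: configure() is skipped in this sketch.
            } catch (Exception e) {
                throw new IOException("Could not instantiate Kafka deserializer", e);
            }
        }
        // The topic name is not available at this point, hence null.
        return deserializer.deserialize(null, message);
    }

    @Override
    public boolean isEndOfStream(T nextElement) {
        return false; // unbounded Kafka stream
    }

    @Override
    public TypeInformation<T> getProducedType() {
        return TypeExtractor.getForClass(producedType);
    }
}
---

A consumer would then take something like new KafkaDeserializerWrapper<>(MyKafkaDeserializer.class, MyType.class) instead of a custom schema; a symmetric wrapper around Kafka’s Serializer would cover the producer side.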

Cheers,
Gordon

[1] https://github.com/apache/flink/pull/2705

On December 8, 2016 at 5:00:19 AM, milind parikh (milindspar...@gmail.com) 
wrote:

Why not use a self-describing format (JSON), stream it as String, read it through a JSON reader, and avoid top-level reflection?

Github.com/milindparikh/streamingsi

https://github.com/milindparikh/streamingsi/tree/master/epic-poc/sprint-2-simulated-data-no-cdc-advanced-eventing/2-dataprocessing

Apologies if I misunderstood the question, but I can quite see how to model your Product class (or indeed any POJO) in a fairly generic way (assuming JSON).
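
Roughly, the idea looks like this (a sketch assuming Jackson on the classpath and the Product class from Matt’s mail; the function name is made up):

---
import com.fasterxml.jackson.databind.ObjectMapper;
import org.apache.flink.api.common.functions.RichMapFunction;
import org.apache.flink.configuration.Configuration;

// Sketch of "stream as String, parse downstream": the Kafka source keeps
// using SimpleStringSchema, and a map function turns each JSON record
// into a Product. No Flink-side reflection over the POJO is needed.
public class JsonToProduct extends RichMapFunction<String, Product> {

    private transient ObjectMapper mapper;

    @Override
    public void open(Configuration parameters) {
        // Created per task instance; ObjectMapper is not serializable.
        mapper = new ObjectMapper();
    }

    @Override
    public Product map(String json) throws Exception {
        return mapper.readValue(json, Product.class);
    }
}
---

The pipeline would then be something like env.addSource(kafkaConsumerWithSimpleStringSchema).map(new JsonToProduct()).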

The real issue, faced when you have different versions of the same POJO class, is that it requires storing enough information to dynamically instantiate the actual version of the class; that, I believe, is beyond the simple use case.

Milind

On Dec 7, 2016 2:42 PM, "Matt" <dromitl...@gmail.com> wrote:
I've read your example, but it has the same problem: you're serializing your POJO as a string in which all fields are separated by "\t". This may work for you, but not in general.

https://github.com/luigiselmi/pilot-sc4-fcd-producer/blob/master/src/main/java/eu/bde/sc4pilot/fcd/FcdTaxiEvent.java#L60

I would like to see a more "generic" approach for the Product class in my last message. I believe a more general-purpose de/serializer for POJOs should be achievable using reflection.
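
Something along these lines is what I have in mind (an untested sketch; it only handles flat POJOs with public String/double/long fields, and both sides must share the same class definition so that reflection yields the same field order):

---
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.lang.reflect.Field;

// Sketch of a reflection-based codec for flat POJOs such as Product.
// Field order comes from Class#getFields(), so producer and consumer
// must use the same class definition. Null fields are not handled.
public final class ReflectivePojoCodec<T> {

    private final Class<T> type;

    public ReflectivePojoCodec(Class<T> type) {
        this.type = type;
    }

    public byte[] serialize(T value) throws Exception {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        DataOutputStream out = new DataOutputStream(bos);
        for (Field f : type.getFields()) {
            Class<?> ft = f.getType();
            if (ft == String.class) {
                out.writeUTF((String) f.get(value));
            } else if (ft == double.class) {
                out.writeDouble(f.getDouble(value));
            } else if (ft == long.class) {
                out.writeLong(f.getLong(value));
            } else {
                throw new IllegalArgumentException("Unsupported field type: " + ft);
            }
        }
        return bos.toByteArray();
    }

    public T deserialize(byte[] bytes) throws Exception {
        T value = type.getDeclaredConstructor().newInstance();
        DataInputStream in = new DataInputStream(new ByteArrayInputStream(bytes));
        for (Field f : type.getFields()) {
            Class<?> ft = f.getType();
            if (ft == String.class) {
                f.set(value, in.readUTF());
            } else if (ft == double.class) {
                f.setDouble(value, in.readDouble());
            } else if (ft == long.class) {
                f.setLong(value, in.readLong());
            }
        }
        return value;
    }
}
---

Handling nested types and different versions of the same class is where it would get hard, of course.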

On Wed, Dec 7, 2016 at 1:16 PM, Luigi Selmi <luigise...@gmail.com> wrote:
Hi Matt,

I had the same problem: trying to read some records in event time using a POJO, do some transformations, and save the result into Kafka for further processing. I am not done yet, but maybe the code I wrote, starting from the Flink Forward 2016 training docs, could be useful.

https://github.com/luigiselmi/pilot-sc4-fcd-producer


Best,

Luigi 

On 7 December 2016 at 16:35, Matt <dromitl...@gmail.com> wrote:
Hello,

I don't quite understand how to integrate Kafka and Flink; after a lot of thought and hours of reading I feel I'm still missing something important.

So far I haven't found a non-trivial but simple example of a stream of a custom class (POJO). It would be good to have such an example in the Flink docs: I can think of many scenarios in which using SimpleStringSchema is not an option, yet all Kafka+Flink guides insist on using it.

Maybe we can add a simple example to the documentation [1]; it would be really helpful for many of us. Also, explaining how to create a Flink De/SerializationSchema from a Kafka De/Serializer would be really useful and would save a lot of people a lot of time; it's not clear why you need both of them, or whether you need both at all.

As far as I know, Avro is a common choice for serialization, but I've read that Kryo's performance is much better (true?). I would guess, though, that the fastest approach is writing your own de/serializer.

1. What do you think about adding some thoughts on this to the documentation?
2. Can anyone provide an example for the following class?

---
public class Product {
    public String code;
    public double price;
    public String description;
    public long created;
}
---
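
To be concrete, something like this is roughly the shape I imagine (an untested sketch using Jackson for JSON; I don't know whether it's the idiomatic way in Flink, and the interface packages may differ between versions):

---
import java.io.IOException;

import com.fasterxml.jackson.databind.ObjectMapper;
import org.apache.flink.api.common.typeinfo.TypeInformation;
import org.apache.flink.api.java.typeutils.TypeExtractor;
import org.apache.flink.streaming.util.serialization.DeserializationSchema;
import org.apache.flink.streaming.util.serialization.SerializationSchema;

// Sketch: one schema class usable for both the Kafka consumer and
// producer, mapping Product to and from JSON bytes with Jackson.
public class ProductSchema implements DeserializationSchema<Product>, SerializationSchema<Product> {

    // static, so the non-serializable mapper is never shipped with the job
    private static final ObjectMapper MAPPER = new ObjectMapper();

    @Override
    public Product deserialize(byte[] message) throws IOException {
        return MAPPER.readValue(message, Product.class);
    }

    @Override
    public boolean isEndOfStream(Product nextElement) {
        return false; // unbounded stream
    }

    @Override
    public byte[] serialize(Product element) {
        try {
            return MAPPER.writeValueAsBytes(element);
        } catch (IOException e) {
            throw new RuntimeException("Failed to serialize Product", e);
        }
    }

    @Override
    public TypeInformation<Product> getProducedType() {
        return TypeExtractor.getForClass(Product.class);
    }
}
---

Both the consumer and the producer could then take new ProductSchema(), if I understand the API correctly.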

Regards,
Matt

[1] http://data-artisans.com/kafka-flink-a-practical-how-to/



--
Luigi Selmi, M.Sc.
Fraunhofer IAIS, Schloss Birlinghoven
53757 Sankt Augustin, Germany
Phone: +49 2241 14-2440

