Hi,

Yes, if Flink does not recognize your registered state type, it will by
default use Kryo for the serialization.
And generally speaking, Kryo does not have good support for evolvable
schemas compared to other serialization frameworks such as Avro or Protobuf.

The reason why Flink defaults to Kryo for unrecognizable types has some
historical reasons due to the original use of Flink's type serialization
stack being used on the batch side, but IMO the short answer is that it
would make sense to have a different default serializer (perhaps Avro) for
snapshotting state in streaming programs.
However, I believe this would be better suited as a separate discussion
thread.

The good news is that with Flink 1.7, state schema evolution is fully
supported out of the box for Avro types, such as GenericRecord or code
generated SpecificRecords.
If you want to have evolvable schema for your state types, then it is
recommended to use Avro as state types.
Support for evolving schema of other data types such as POJOs and Scala
case classes is also on the radar for future releases.

Does this help answer your question?

By the way, the slides your are looking at I would consider quite outdated
for the topic, since Flink 1.7 was released with much smoother support for
state schema evolution.
An updated version of the slides is not yet publicly available, but if you
want I can send you one privately.
Otherwise, the Flink docs for 1.7 would also be equally helpful.

Cheers,
Gordon


On Fri, Dec 21, 2018, 8:11 PM Padarn Wilson <padarn.wil...@grab.com wrote:

> Hi all,
>
> I am trying to understand the situation with state serialization in flink.
> I'm looking at a number of sources, but slide 35 from here crystalizes my
> confusion:
>
>
> https://www.slideshare.net/FlinkForward/flink-forward-berlin-2017-tzuli-gordon-tai-managing-state-in-apache-flink
>
> So, I understand that if 'Flink's own serialization stack' is unable to
> serialize a type you define, then it will fall back on Kryo generics. In
> this case, I believe what I'm being told is that state compatibility is
> difficult to ensure, and schema evolution in your jobs is not possible.
>
> However on this slide, they say
> "
>        Kryo is generally not  recommended ...
>
>        Serialization frameworks with schema evolution support is
> recommended: Avro,         Thrift
> "
> So is this implying that Flink's non-default serialization stack does not
> support schema evolution? In this case is it best practice to register
> custom serializers whenever possible.
>
> Thanks
>
>
> *Grab is hiring. Learn more at **https://grab.careers
> <https://grab.careers/>*
>
> By communicating with Grab Inc and/or its subsidiaries, associate
> companies and jointly controlled entities (“Grab Group”), you are deemed to
> have consented to processing of your personal data as set out in the
> Privacy Notice which can be viewed at https://grab.com/privacy/
>
> This email contains confidential information and is only for the intended
> recipient(s). If you are not the intended recipient(s), please do not
> disseminate, distribute or copy this email and notify Grab Group
> immediately if you have received this by mistake and delete this email from
> your system. Email transmission cannot be guaranteed to be secure or
> error-free as any information therein could be intercepted, corrupted,
> lost, destroyed, delayed or incomplete, or contain viruses. Grab Group do
> not accept liability for any errors or omissions in the contents of this
> email arises as a result of email transmission. All intellectual property
> rights in this email and attachments therein shall remain vested in Grab
> Group, unless otherwise provided by law.
>

Reply via email to