SmedbergM created FLINK-33062:
---------------------------------
Summary: Deserialization creates multiple instances of case
objects in Scala 2.13
Key: FLINK-33062
URL: https://issues.apache.org/jira/browse/FLINK-33062
Project: Flink
Issue Type: Bug
Components: API / Core
Affects Versions: 1.17.1, 1.15.4
Environment: Scala 2.13.12
Flink 1.17.1 and 1.15.4 running inside IntelliJ
Flink 1.15.4 running in AWS Managed Flink
Reporter: SmedbergM
See [https://github.com/SmedbergM/mwe-flink-2-13-deserialization] for a minimal
working example.
When running a Flink job with Scala 2.13, deserialized objects whose fields are
case objects have those case objects re-instantiated. Thus any code that relies
on reference equality (such as methods of `scala.Option`) will break.
I suspect that this is due to Kyro deserialization not being singleton-aware
for case objects, but I haven't been able to drill in and catch this in the act.
Here are relevant lines of my application log:
{code:java}
17:37:13.224 [jobmanager-io-thread-1] INFO o.a.f.r.c.CheckpointCoordinator --
No checkpoint found during restore.
17:37:13.531 [parse-book -> Sink: log-book (2/2)#0] WARN
smedbergm.mwe.BookSink$ -- Book.isbn: None with identityHashCode 1043314405
reports isEmpty false; true None is 2019204827
17:37:13.531 [parse-book -> Sink: log-book (1/2)#0] INFO
smedbergm.mwe.BookSink$ -- Winkler, Scott: Terraform In Action (ISBN
978-1-61729-689-5) $49.99
17:37:13.534 [parse-book -> Sink: log-book (2/2)#0] WARN
smedbergm.mwe.BookSink$ -- Book.isbn: None with identityHashCode 465138693
reports isEmpty false; true None is 2019204827
17:37:13.534 [parse-book -> Sink: log-book (1/2)#0] INFO
smedbergm.mwe.BookSink$ -- Čukić, Ivan: Functional Programming in C++ (ISBN
978-1-61729-381-8) $49.99
17:37:13.538 [flink-akka.actor.default-dispatcher-8] INFO
o.a.f.r.c.CheckpointCoordinator -- Stopping checkpoint coordinator for job
eed1c049790ac5f38664ddfd6b049282. {code}
I know that https://issues.apache.org/jira/browse/FLINK-13414 (support for
Scala 2.13) is still listed as in-progress, but there is no warning in the docs
that using 2.13 might not be stable. (This particular error does not occur on
Scala 2.12, in this case because Option.isEmpty was re-implemented in 2.13;
however, I suspect that multiple deserialization may be occurring already in
2.12.)
--
This message was sent by Atlassian Jira
(v8.20.10#820010)