Hi Gordon,
Thanks for driving this discussion!

I would go with the second option: two consecutive StateFun releases,
2.2.1 and 2.2.2. The Flink 1.11.3 release might take a while, and this
hotfix release is important enough to get out as early as possible.

Cheers,
Igal.




On Mon, Nov 2, 2020 at 11:43 AM Tzu-Li (Gordon) Tai <tzuli...@apache.org>
wrote:

> Hi,
>
> We’re currently thinking about releasing StateFun 2.2.1, to address a
> critical bug that causes restores from checkpoints / savepoints to fail
> under certain circumstances [1].
>
> To provide a bit more context, the full fix for this issue is two-fold:
>
>    1. *Fix restoring from checkpoints / savepoints taken with the same
>    StateFun version:* this has already been fixed in StateFun, with
>    changes backported to `flink-statefun/release-2.2`.
>    2. *Allow restoring from older savepoints taken with StateFun <=
>    2.2.0:* this requires a few fixes to Flink around restoring heap-based
>    timers [2] and iterating through key groups in restored raw keyed state
>    streams [3]. These fixes will be included in Flink 1.11.3 [4], meaning that
>    to fix this, StateFun will need to wait until Flink 1.11.3 is out and
>    upgrade its Flink dependency.
>
> The main discussion point here is whether or not it makes sense for
> StateFun 2.2.1 to wait for Flink 1.11.3, so that both parts of the problem,
> 1) and 2), can be solved together in a single hotfix release.
>
> The other option is to release StateFun 2.2.1 already with fixes for
> problem 1) only, and have another follow-up hotfix release 2.2.2 after
> Flink 1.11.3 is available.
>
> I propose to keep a close eye on the progress of Flink 1.11.3 (you can
> track progress on the 1.11.3 discussion thread [4]), and *make a decision
> here mid-week on Wednesday, Nov. 4th*.
> If by then we decide to not let StateFun 2.2.1 wait for Flink 1.11.3
> because it could take a while, we can start with a StateFun 2.2.1 RC right
> away; otherwise, if Flink 1.11.3 seems to be just around the corner, we can
> wait for a few more days.
>
> What do you think?
>
> Cheers,
> Gordon
>
> [1] https://issues.apache.org/jira/browse/FLINK-19692
> [2] https://github.com/apache/flink/pull/13761
> [3] https://github.com/apache/flink/pull/13772
> [4]
> http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-Releasing-Apache-Flink-1-11-3-td45989.html
>
