tzulitai edited a comment on pull request #13773: URL: https://github.com/apache/flink/pull/13773#issuecomment-717730004
@Antti-Kaikkonen thank you for trying the branch out. I think the exceptions you encountered are expected given the experiments you ran. Could you adjust your experiment as follows and then report back?

Try out your application `FlinkStatefunCountTo1M` with a new build of StateFun that includes the changes in https://github.com/apache/flink-statefun/pull/168. You should be able to just pull that branch, do a clean build (`mvn clean install -DskipTests`), and then change the StateFun dependency in your application to `2.3-SNAPSHOT`. Then create a savepoint and try to restore from it as you did in your previous test. Note that you should not need to apply any Flink fixes for this.

---

Let me briefly explain our release plans for addressing the issue you reported, and why the above adjustment makes sense:

1. With the StateFun changes in https://github.com/apache/flink-statefun/pull/168 (and not including ANY Flink changes), we expect restoring from checkpoints / savepoints to work properly for all checkpoints / savepoints taken with a new version that includes https://github.com/apache/flink-statefun/pull/168. This would already address FLINK-19692, and we're planning to push out a StateFun hotfix release immediately to unblock you and other users who may be encountering the same issue.
2. What https://github.com/apache/flink-statefun/pull/168 doesn't yet solve is the ability to safely restore / upgrade from a savepoint taken with StateFun versions <= 2.2.0. This does not affect you if you don't have StateFun applications running in production yet. Enabling this requires the fixes in this PR and #13761 to land in Flink, a new Flink release, and ultimately another follow-up StateFun hotfix release that uses the new Flink version. That is a lengthier process, with an estimate of another 3-4 weeks, so we decided to go ahead with the above option first to move faster.

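For the dependency change mentioned above, here is a minimal sketch of what the `pom.xml` entry might look like after the local build. It assumes the application depends on the `statefun-sdk` artifact; that is an assumption, so apply the same version to whichever StateFun artifacts the application actually declares.

```xml
<!-- Sketch only: assumes the application declares statefun-sdk.
     Any other StateFun artifacts it uses should be set to the same version. -->
<dependency>
  <groupId>org.apache.flink</groupId>
  <artifactId>statefun-sdk</artifactId>
  <!-- Resolved from the local Maven repository populated by `mvn clean install -DskipTests` -->
  <version>2.3-SNAPSHOT</version>
</dependency>
```

Because `2.3-SNAPSHOT` is installed only into the local Maven repository, the application has to be built on the same machine (or against the same local repository) as the StateFun build.
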
---

TL;DR: It would be tremendously helpful if you could redo your experiment with only a new StateFun build that includes https://github.com/apache/flink-statefun/pull/168. Please do let me know the results!

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org