Hi Gyula,

Is your 1.3 savepoint from Flink 1.3.1 or 1.3.0? Those versions had a critical bug that caused duplicate partition assignments in corner cases, so the assignment logic was changed between 1.3.1 and 1.3.2 (and therefore also in 1.4.0).

If you were indeed using 1.3.1 or 1.3.0, and you are certain that the savepoint does not contain duplicate partition assignments caused by the bug, then yes, restoring with parallelism (DOP) 1 and then rescaling is a good workaround. Please see the 1.3.2 release announcement [1] for details.

Best,
Gordon

[1] http://flink.apache.org/news/2017/08/05/release-1.3.2.html

On Jan 8, 2018 6:57 AM, "Gyula Fóra" <gyula.f...@gmail.com> wrote:

Migrating the jobs by setting the sources to parallelism = 1 and then scaling back up after the migration seems to be a good workaround, but I am wondering whether something I did caused this or whether it is a bug.

Gyula Fóra <gyula.f...@gmail.com> wrote (on Mon, Jan 8, 2018, 14:46):

> Hi,
>
> Is it possible that the Kafka partition assignment logic has changed
> between Flink 1.3 and 1.4? I am trying to migrate some jobs using Kafka
> 0.8 sources, and about half of my jobs lost offset state for some partitions
> (but not all partitions). Jobs with parallelism 1 don't seem to be
> affected...
>
> Any ideas?
>
> Gyula
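
For anyone reading this thread later, here is a toy sketch of why a change in assignment logic can matter at parallelism > 1 but not at parallelism 1. Both assignment functions below are hypothetical stand-ins, not Flink's actual 1.3.x or 1.4 logic; they only show that two different partition-to-subtask mappings can disagree about which subtask owns a partition, while at parallelism 1 every mapping necessarily agrees.

```python
def assign_modulo(partition: int, parallelism: int) -> int:
    """Hypothetical scheme A: partition index modulo parallelism."""
    return partition % parallelism


def assign_shifted_modulo(partition: int, parallelism: int, start: int = 1) -> int:
    """Hypothetical scheme B: the same round-robin, shifted by a start index."""
    return (start + partition) % parallelism


def moved_partitions(num_partitions: int, parallelism: int) -> list:
    """Return the partitions whose owning subtask differs between A and B."""
    return [
        p
        for p in range(num_partitions)
        if assign_modulo(p, parallelism) != assign_shifted_modulo(p, parallelism)
    ]


if __name__ == "__main__":
    # With three subtasks, the two toy schemes disagree on ownership.
    print(moved_partitions(6, 3))
    # With one subtask, every partition maps to subtask 0 under both schemes.
    print(moved_partitions(6, 1))
```

In this toy every partition moves at parallelism 3, which is more drastic than the partial loss reported above; the point is only that at parallelism 1 the single subtask owns all partitions regardless of scheme, which fits the observation that those jobs were unaffected.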