+dev
> On Aug 30, 2022, at 11:20 AM, Rion Williams <rionmons...@gmail.com> wrote:
>
> Hi all,
>
> I wasn't sure if this would be the best audience; if not, please advise if
> you know of a better place to ask. I figured that at least some folks here
> either work for Ververica or might have used their platform.
>
> tl;dr: I'm trying to migrate an existing stateful Flink job to run in
> Ververica Platform (Community), and I'm noticing that it doesn't seem that
> all of the state is being properly handed off (only _metadata).
>
> I'm currently in the process of migrating an existing Flink job, which runs
> on its own in Kubernetes, to run within the Ververica platform. The issue
> here is that the job itself is stateful, so I want to ensure I can migrate
> that state over so that when the new job kicks off, the transition is
> fairly seamless.
>
> Basically, what I've done up to this point is create a script as part of
> the Ververica platform deployment that will:
>
> 1. Check for the existence of any of the known jobs that have been
>    migrated.
>    - If one is found, it will stop the job, taking a full savepoint, and
>      store the savepoint path within a configmap for that job, used solely
>      for migration purposes.
>    - If one is not found, it will assume the job has already been migrated.
> 2. Create a Deployment for each of the new jobs, pointing to the
>    appropriate configuration, jars, etc.
> 3. Check for the presence of one of the previous migration configmaps and
>    issue a request to create a savepoint for that deployment.
>    - This involves using the Ververica REST API to grab the appropriate
>      deployment information and issuing a request to the Savepoints
>      endpoint of the same REST API to "add" the savepoint.
>
> I've confirmed the above "works": it stops any legacy jobs, creates the
> resources (i.e. configmaps) used for the migration, and starts up the new
> job within Ververica, and I can see evidence within the UI that a
> savepoint was "COPIED" for that deployment.
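On step 1, for anyone following along: stopping the legacy job with a full savepoint can be done against the legacy job's own Flink REST API (POST /jobs/:jobid/stop, then polling GET /jobs/:jobid/savepoints/:triggerid for the resulting path). A minimal sketch of building that request; the JobManager host, job ID, and bucket path below are placeholders, not values from this thread:

```python
import json

# Placeholder JobManager address for the legacy (pre-Ververica) job.
FLINK_REST = "http://legacy-job-jobmanager:8081"

def stop_with_savepoint_request(job_id, target_dir):
    """Build the URL and JSON body for Flink's stop-with-savepoint call
    (POST /jobs/:jobid/stop). Returns (url, body) so any HTTP client can
    send it; the response carries a trigger ID to poll under
    GET /jobs/:jobid/savepoints/:triggerid for the savepoint path."""
    url = "{}/jobs/{}/stop".format(FLINK_REST, job_id)
    body = json.dumps({
        "targetDirectory": target_dir,  # e.g. a gs:// path in this setup
        "drain": False,                 # stop cleanly without draining
    }).encode()
    return url, body

url, body = stop_with_savepoint_request("a1b2c3", "gs://bucket/savepoints")
```

The savepoint path returned by the poll is what would then land in the per-job migration configmap.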
> However, when comparing (in GCS) the previous savepoint for the old job
> and the one now managed by Ververica, I notice that the new one contains
> only a single _metadata file:
>
> [image omitted]
>
> Whereas the previous one contained a _metadata file and another related
> data file:
>
> [image omitted]
>
> This leads me to believe that the new job might not know about any items
> previously stored in state, which could be problematic.
>
> When reviewing the documentation on "manually adding a savepoint" for
> Ververica Platform 2.6, I noticed that the payload to the Savepoints
> endpoint looked like the following, which is what I used:
>
>     metadata:
>       deploymentId: ${deploymentId}
>       annotations:
>         com.dataartisans.appmanager.controller.deployment.spec.version: ${deploymentSpecVersion}
>       type: ${type} (used FULL in my case)
>     spec:
>       savepointLocation: ${savepointLocation}
>       flinkSavepointId: 00000000-0000-0000-0000-000000000000
>     status:
>       state: COMPLETED
>
> The empty UUID was a bit concerning, and I was curious whether that might
> be the reason my additional data files didn't come across from the
> savepoint (I noticed that in 2.7 this is an optional argument in the
> payload). I don't see any additional configuration that would otherwise
> specify to pull everything, including _metadata.
>
> Any ideas or guidance would be helpful.
>
> Rion