Re: What are savepoint state manipulation support plans

2019-05-29 Thread Tzu-Li (Gordon) Tai
FYI: Seth is starting a FLIP for adding a savepoint connector that addresses
this -
http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/Discuss-FLIP-43-Savepoint-Connector-td29233.html

Please join the discussion there if you are interested!

On Thu, Mar 28, 2019 at 5:23 PM Tzu-Li (Gordon) Tai 
wrote:

> @Ufuk
>
> Yes, creating a JIRA now to track this makes sense.
>
> I've proceeded to open one:
> https://issues.apache.org/jira/browse/FLINK-12047
> Let's move any further discussions there.
>
> Cheers,
> Gordon
>
> On Thu, Mar 28, 2019 at 5:01 PM Ufuk Celebi  wrote:
>
>> I think such a tool would be really valuable to users.
>>
>> @Gordon: What do you think about creating an umbrella ticket for this
>> and linking it in this thread? That way, it's easier to follow this
>> effort. You could also link Bravo and Seth's tool in the ticket as
>> starting points.
>>
>> – Ufuk
>>
>


Re: What are savepoint state manipulation support plans

2019-03-28 Thread Ufuk Celebi
Thanks Gordon. We already have 5 people watching it. :-)

On Thu, Mar 28, 2019 at 10:23 AM Tzu-Li (Gordon) Tai
 wrote:
>
> @Ufuk
>
> Yes, creating a JIRA now to track this makes sense.
>
> I've proceeded to open one:  https://issues.apache.org/jira/browse/FLINK-12047
> Let's move any further discussions there.
>
> Cheers,
> Gordon
>
> On Thu, Mar 28, 2019 at 5:01 PM Ufuk Celebi  wrote:
>>
>> I think such a tool would be really valuable to users.
>>
>> @Gordon: What do you think about creating an umbrella ticket for this
>> and linking it in this thread? That way, it's easier to follow this
>> effort. You could also link Bravo and Seth's tool in the ticket as
>> starting points.
>>
>> – Ufuk


Re: What are savepoint state manipulation support plans

2019-03-28 Thread Tzu-Li (Gordon) Tai
@Ufuk

Yes, creating a JIRA now to track this makes sense.

I've proceeded to open one:
https://issues.apache.org/jira/browse/FLINK-12047
Let's move any further discussions there.

Cheers,
Gordon

On Thu, Mar 28, 2019 at 5:01 PM Ufuk Celebi  wrote:

> I think such a tool would be really valuable to users.
>
> @Gordon: What do you think about creating an umbrella ticket for this
> and linking it in this thread? That way, it's easier to follow this
> effort. You could also link Bravo and Seth's tool in the ticket as
> starting points.
>
> – Ufuk
>


Re: What are savepoint state manipulation support plans

2019-03-28 Thread Vishal Santoshi
+1

On Thu, Mar 28, 2019, 5:01 AM Ufuk Celebi  wrote:

> I think such a tool would be really valuable to users.
>
> @Gordon: What do you think about creating an umbrella ticket for this
> and linking it in this thread? That way, it's easier to follow this
> effort. You could also link Bravo and Seth's tool in the ticket as
> starting points.
>
> – Ufuk
>


Re: What are savepoint state manipulation support plans

2019-03-28 Thread Ufuk Celebi
I think such a tool would be really valuable to users.

@Gordon: What do you think about creating an umbrella ticket for this
and linking it in this thread? That way, it's easier to follow this
effort. You could also link Bravo and Seth's tool in the ticket as
starting points.

– Ufuk


Re: What are savepoint state manipulation support plans

2019-03-28 Thread Tzu-Li (Gordon) Tai
Hi!

Regarding the support for savepoint reading / writing / processing directly
in core Flink, we've been thinking about that lately and might push a bit
for adding the functionality to Flink in the next release.
For example, besides Bravo, Seth (CC'ed) has also implemented something [1]
for this. We should start thinking about converging the efforts of similar
tools and supporting it in Flink soon.
There's no official JIRA / feature proposal for this yet, but if you're
interested, please keep an eye on the dev mailing list for it in the future.

Cheers,
Gordon

[1] https://github.com/sjwiesman/flink/tree/savepoint-connector

On Thu, Mar 28, 2019 at 4:26 PM Gyula Fóra  wrote:

> Hi!
>
> I don't think there is any ongoing effort in core Flink other than this
> library we created.
>
> You are probably right that it is pretty hacky at the moment. I would say
> this is one way we could do it, one that seemed convenient to me at the time
> I wrote the code.
>
> If you have ideas on how to structure it better or improve it, you know
> where to find the code, so feel free to open a PR :) That might actually take
> us closer to having this properly in Flink one day soon.
>
> Just to clarify the code you are showing:
> writer.writeAll() -> Runs the batch job that writes the checkpoint files
> for the changed operator states, returns the reference to the OperatorState
> metadata object
> StateMetadataUtils.createNewSavepoint() -> Replaces the metadata for the
> operator states you have just written in the previous savepoint
> StateMetadataUtils.writeSavepointMetadata() -> Writes a new metadata file
>
> So metadata writing happens as the very last step after the batch job has
> run. This is similar to how it works in streaming jobs, in the sense that there
> the JobManager writes the metafile after the checkpointing is done. The
> downside of this approach is that the client might not have access to write
> the metafile here.
>
> Gyula
>
>
>


Re: What are savepoint state manipulation support plans

2019-03-28 Thread Gyula Fóra
Hi!

I don't think there is any ongoing effort in core Flink other than this
library we created.

You are probably right that it is pretty hacky at the moment. I would say
this is one way we could do it, one that seemed convenient to me at the time
I wrote the code.

If you have ideas on how to structure it better or improve it, you know
where to find the code, so feel free to open a PR :) That might actually take
us closer to having this properly in Flink one day soon.

Just to clarify the code you are showing:
writer.writeAll() -> Runs the batch job that writes the checkpoint files
for the changed operator states, returns the reference to the OperatorState
metadata object
StateMetadataUtils.createNewSavepoint() -> Replaces the metadata for the
operator states you have just written in the previous savepoint
StateMetadataUtils.writeSavepointMetadata() -> Writes a new metadata file

So metadata writing happens as the very last step after the batch job has
run. This is similar to how it works in streaming jobs, in the sense that there
the JobManager writes the metafile after the checkpointing is done. The
downside of this approach is that the client might not have access to write
the metafile here.
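
To make that ordering concrete (state files first, metadata last), here is a
minimal sketch in plain Scala. It is not Bravo's implementation and does not
use Flink at all: OperatorHandle, SavepointMeta and MetadataLastSketch are
hypothetical stand-ins, local files stand in for the output of the batch job,
and the one-line-per-operator _metadata format is invented for illustration.

```scala
import java.nio.charset.StandardCharsets
import java.nio.file.{Files, Path}

// Hypothetical stand-ins for illustration only; not Bravo or Flink classes.
final case class OperatorHandle(uid: String, stateFiles: List[Path])
final case class SavepointMeta(operators: Map[String, OperatorHandle])

object MetadataLastSketch {

  // Step 1 (stands in for writer.writeAll()): blocks until every state file
  // is written, then returns a handle that references the new files.
  def writeAll(dir: Path, uid: String, partitions: Seq[String]): OperatorHandle = {
    Files.createDirectories(dir)
    val files = partitions.zipWithIndex.map { case (data, i) =>
      Files.write(dir.resolve(s"$uid-part-$i"), data.getBytes(StandardCharsets.UTF_8))
    }
    OperatorHandle(uid, files.toList)
  }

  // Step 2 (stands in for createNewSavepoint()): replaces the entry of the
  // rewritten operator and keeps every other operator of the old savepoint.
  def createNewSavepoint(old: SavepointMeta, updated: OperatorHandle): SavepointMeta =
    old.copy(operators = old.operators.updated(updated.uid, updated))

  // Step 3 (stands in for writeSavepointMetadata()): runs in the calling
  // process, so the caller needs write access to the target directory.
  def writeSavepointMetadata(dir: Path, meta: SavepointMeta): Path = {
    val lines = meta.operators.values.toList.sortBy(_.uid).map { op =>
      s"${op.uid}=${op.stateFiles.map(_.getFileName).mkString(",")}"
    }
    Files.write(dir.resolve("_metadata"), lines.mkString("\n").getBytes(StandardCharsets.UTF_8))
  }

  def main(args: Array[String]): Unit = {
    val dir = Files.createTempDirectory("savepoint-sketch")
    val savepoint = SavepointMeta(Map("my-operator" -> OperatorHandle("my-operator", Nil)))
    val newOpState = writeAll(dir, "my-operator", Seq("a=1", "b=2"))
    val newSavepoint = createNewSavepoint(savepoint, newOpState)
    println(writeSavepointMetadata(dir, newSavepoint))
  }
}
```

In this sketch the third call only runs after the first has returned, so a
missing _metadata file means either step 1 never completed or the process
running step 3 could not write to the directory.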

Gyula


What are savepoint state manipulation support plans

2019-03-27 Thread Sergei Poganshev
What are the plans to support savepoint state manipulation with batch jobs
natively in core Flink?

I've tried using the bravo tool [1]. It's pretty good at reading
savepoints, but writing seems hacky. For example, I wonder what exactly
happens with the following lines:

val newOpState = writer.writeAll()
val newSavepoint = StateMetadataUtils.createNewSavepoint(savepoint, newOpState)
StateMetadataUtils.writeSavepointMetadata(savepointDir, newSavepoint)

Does it actually wait for the batch job to finish all its tasks and only then
write the metadata file? I'm asking because this code didn't execute at all when
I tried to run it in a k8s environment with a standalone-job.sh setup (i.e.
the _metadata file did not get created).


[1] https://github.com/king/bravo