Hi Guojun,

I'd like to share my thoughts on your questions.
1. Expiration of savepoints

In my opinion, savepoints are created at long intervals, so there will not be too many of them. If users create one savepoint per day, there are 365 savepoints a year. So I didn't consider expiration, and I think providing a Flink action like `delete-savepoint id = 1` is enough for now. But if it is really important, we can introduce table options for it and expire savepoints in the same way we expire snapshots.
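For example, with an option like `savepoint.time-retained` the expiration could look roughly like this (a minimal sketch; the class and method names are made up for illustration and are not part of the PIP):

```java
import java.time.Duration;
import java.time.Instant;
import java.util.List;

// Illustrative sketch only: expire savepoints older than a retention threshold,
// in the same spirit as snapshot expiration. All names here are hypothetical.
public class SavepointExpireSketch {

    /** Minimal stand-in for the savepoint metadata discussed in the PIP. */
    public interface Savepoint {
        long id();
        Instant creationTime();
    }

    /** Deletes every savepoint whose age exceeds the configured retention. */
    public void expire(List<Savepoint> savepoints, Duration timeRetained, Instant now) {
        Instant cutoff = now.minus(timeRetained);
        for (Savepoint sp : savepoints) {
            if (sp.creationTime().isBefore(cutoff)) {
                // Only the savepoint metadata is removed here. Data files are cleaned
                // up later, and only if no snapshot or other savepoint references them.
                deleteSavepointMeta(sp.id());
            }
        }
    }

    private void deleteSavepointMeta(long savepointId) {
        // e.g. remove <table-path>/savepoint/savepoint-<id>
    }
}
```

The point is simply that the same retention-based logic we already use for expiring snapshots would also work for savepoints.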
2.

> id of compacted snapshot picked by the savepoint

My initial idea was to pick a compacted snapshot, or to do a compaction before creating the savepoint. But after discussing with Jingsong, I found it is difficult. So now I propose to create the savepoint directly from the given snapshot. Maybe we can optimize this later. The changes will be updated soon.

> manifest file list in system-table

I think the manifest file list is not very important for users. Users can find when a savepoint was created, get the savepoint id, and then query the data of that savepoint by id. I don't see a scenario where users would need the manifest file information. What do you think?

Best,
Yu Zelin

> On May 24, 2023, at 10:50, Guojun Li <[email protected]> wrote:
>
> Thanks zelin for bringing up the discussion. I'm thinking about:
> 1. How to manage the savepoints if there is no expiration mechanism, by the TTL management of storages or an external script?
> 2. I think the id of the compacted snapshot picked by the savepoint and the manifest file list are also important information for users. Could this information be stored in the system-table?
>
> Best,
> Guojun
>
> On Mon, May 22, 2023 at 9:13 PM Jingsong Li <[email protected]> wrote:
>
>> FYI
>>
>> The PIP lacks a table to show Discussion thread & Vote thread & ISSUE...
>>
>> Best
>> Jingsong
>>
>> On Mon, May 22, 2023 at 4:48 PM yu zelin <[email protected]> wrote:
>>>
>>> Hi all,
>>>
>>> Thanks to all of you for your suggestions and questions. After reading them, I have adopted some, and I want to share my opinions here.
>>>
>>> To make my statements clearer, I will still use the word `savepoint`. When we reach a consensus, the name may be changed.
>>>
>>> 1. The purposes of savepoint
>>>
>>> As Shammon mentioned, Flink and databases also have the concept of a `savepoint`, so it's better to clarify the purposes of ours. Thanks to Nicholas and Jingsong, your explanations are very clear. I'd like to give my summary:
>>>
>>> (1) Fault recovery (or we can say disaster recovery). Users can ROLL BACK to a savepoint if needed. If a user rolls back to a savepoint, the table will hold the data in the savepoint, and the data committed after the savepoint will be deleted. In this scenario we need savepoints because snapshots may have expired; a savepoint can be kept longer and preserves the user's old data.
>>>
>>> (2) Recording versions of data at a longer interval (typically daily or weekly). With savepoints, users can query old data in batch mode. Compared to copying records to a new table or merging incremental records with old records (like using MERGE INTO in Hive), a savepoint is more lightweight because we don't copy data files; we just record their metadata.
>>>
>>> As you can see, a savepoint is very similar to a snapshot. The differences are:
>>>
>>> (1) A savepoint lives longer. In most cases, a snapshot's lifetime is several minutes to hours. We expect a savepoint to live for several days, weeks, or even months.
>>>
>>> (2) A savepoint is mainly used for batch reading of historical data. In this PIP, we don't introduce streaming reading for savepoints.
>>>
>>> 2. Candidates of name
>>>
>>> I agree with Jingsong that we can use a new name. Since the purpose and mechanism of our savepoint (which is very similar to a snapshot) are similar to a `tag` in Iceberg, maybe we can use `tag`.
>>>
>>> In my opinion, an alternative is `anchor`. All the snapshots are like the navigation path of the streaming data, and an `anchor` can fix it in place.
>>>
>>> 3. Public table operations and options
>>>
>>> We propose to expose some operations and table options for users to manage savepoints.
>>>
>>> (1) Operations (currently for Flink)
>>> We provide Flink actions to manage savepoints:
>>> create-savepoint: generate a savepoint from the latest snapshot. Creating from a specified snapshot is also supported.
>>> delete-savepoint: delete a specified savepoint.
>>> rollback-to: roll back to a specified savepoint.
>>>
>>> (2) Table options
>>> We propose to provide options for creating savepoints periodically:
>>> savepoint.create-time: when to create the savepoint. Example: 00:00.
>>> savepoint.create-interval: interval between the creation of two savepoints. Example: 2 d.
>>> savepoint.time-retained: the maximum time to retain savepoints.
>>>
>>> (3) Procedures (future work)
>>> Spark supports SQL extensions. After we support the Spark CALL statement, we can provide procedures to create, delete or roll back to savepoints for Spark users.
>>>
>>> Support of CALL is on the roadmap of Flink. In a future version, we can also support savepoint-related procedures for Flink users.
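For illustration, a minimal sketch of how the `savepoint.create-time` and `savepoint.create-interval` options above could decide when to trigger a periodic savepoint (all class and method names are hypothetical, not part of the proposal):

```java
import java.time.Duration;
import java.time.LocalDateTime;
import java.time.LocalTime;

// Illustrative sketch of how `savepoint.create-time` and `savepoint.create-interval`
// could drive periodic savepoint creation. Class and method names are hypothetical.
public class PeriodicSavepointSketch {

    private final LocalTime createTime;     // e.g. 00:00
    private final Duration createInterval;  // e.g. Duration.ofDays(2) for "2 d"
    private LocalDateTime nextTrigger;

    public PeriodicSavepointSketch(LocalTime createTime, Duration createInterval) {
        this.createTime = createTime;
        this.createInterval = createInterval;
    }

    /** Called on every commit; returns true when a new savepoint should be created. */
    public boolean shouldCreateSavepoint(LocalDateTime now) {
        if (nextTrigger == null) {
            // The first trigger point is the configured create time on the current day.
            nextTrigger = now.toLocalDate().atTime(createTime);
        }
        if (!now.isBefore(nextTrigger)) {
            nextTrigger = nextTrigger.plus(createInterval);
            return true;
        }
        return false;
    }
}
```

Assuming the check runs on every commit, a savepoint would be created at the first commit after the trigger time and then once per `savepoint.create-interval` thereafter.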
>>> 4. Expiration of data files
>>>
>>> Currently, when a snapshot is expired, the data files that are not used by other snapshots are deleted. After we introduce savepoints, we must make sure that the data files referenced by a savepoint will not be deleted.
>>>
>>> Conversely, when a savepoint is deleted, the data files that are not used by existing snapshots or other savepoints will be deleted.
>>>
>>> I have written some POC code to implement this. I will update the mechanism in the PIP soon.
>>>
>>> Best,
>>> Yu Zelin
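For illustration, the reference check described in point 4 of the quoted mail could look roughly like this (a minimal sketch with hypothetical names; the actual mechanism is the one to be added to the PIP):

```java
import java.util.Collection;
import java.util.HashSet;
import java.util.Set;

// Illustrative sketch of the rule above: a data file may be deleted only if it is
// no longer referenced by any live snapshot or any savepoint. Names are hypothetical.
public class DataFileCleanupSketch {

    public Set<String> filesSafeToDelete(
            Set<String> candidates,
            Collection<Set<String>> filesOfLiveSnapshots,
            Collection<Set<String>> filesOfSavepoints) {
        Set<String> stillReferenced = new HashSet<>();
        filesOfLiveSnapshots.forEach(stillReferenced::addAll);
        filesOfSavepoints.forEach(stillReferenced::addAll);

        Set<String> deletable = new HashSet<>(candidates);
        deletable.removeAll(stillReferenced);
        return deletable;
    }
}
```

Whether the cleanup is triggered by snapshot expiration or by savepoint deletion, a data file is only removed once nothing references it anymore.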
>>>> On May 21, 2023, at 20:54, Jingsong Li <[email protected]> wrote:
>>>>
>>>> Thanks Yun for your information.
>>>>
>>>> We need to be careful to avoid confusion between the Paimon and Flink concepts of "savepoint".
>>>>
>>>> Maybe we don't have to insist on using "savepoint"; for example, TAG is also a candidate, just like in Iceberg [1].
>>>>
>>>> [1] https://iceberg.apache.org/docs/latest/branching/
>>>>
>>>> Best,
>>>> Jingsong
>>>>
>>>> On Sun, May 21, 2023 at 8:51 PM Jingsong Li <[email protected]> wrote:
>>>>>
>>>>> Thanks Nicholas for your detailed requirements.
>>>>>
>>>>> We need to supplement the user requirements in the FLIP, which are mainly aimed at two purposes:
>>>>> 1. Fault recovery for data errors (named: restore or rollback-to)
>>>>> 2. Recording versions at the day level, targeting batch queries
>>>>>
>>>>> Best,
>>>>> Jingsong
>>>>>
>>>>> On Sat, May 20, 2023 at 2:55 PM Yun Tang <[email protected]> wrote:
>>>>>>
>>>>>> Hi Guys,
>>>>>>
>>>>>> Since we use Paimon with Flink in most cases, I think we need to distinguish the same word "savepoint" in the different systems.
>>>>>>
>>>>>> For Flink, savepoint means:
>>>>>>
>>>>>> 1. Triggered by users, not periodically triggered by the system itself. However, this FLIP wants to support creating it periodically.
>>>>>> 2. Even the so-called incremental native savepoint [1] does not depend on previous checkpoints or savepoints; it still copies files on DFS to a self-contained savepoint folder. However, from the description in this FLIP about the deletion of expired snapshot files, a Paimon savepoint will refer to the previously existing files directly.
>>>>>>
>>>>>> I don't think we need to make the semantics of Paimon totally the same as Flink's. However, we need to introduce a table to show the differences compared with Flink and discuss them.
>>>>>>
>>>>>> [1] https://cwiki.apache.org/confluence/display/FLINK/FLIP-203%3A+Incremental+savepoints#FLIP203:Incrementalsavepoints-Semantic
>>>>>>
>>>>>> Best
>>>>>> Yun Tang
>>>>>> ________________________________
>>>>>> From: Nicholas Jiang <[email protected]>
>>>>>> Sent: Friday, May 19, 2023 17:40
>>>>>> To: [email protected] <[email protected]>
>>>>>> Subject: Re: [DISCUSS] PIP-4 Support savepoint
>>>>>>
>>>>>> Hi Guys,
>>>>>>
>>>>>> Thanks Zelin for driving the savepoint proposal. I'd like to share some opinions on savepoints:
>>>>>>
>>>>>> -- About "introduce savepoint for Paimon to persist full data in a time point"
>>>>>>
>>>>>> The motivation of the savepoint proposal is more like snapshot TTL management. Actually, disaster recovery is mission critical for any software. Especially when it comes to data systems, the impact could be very serious, leading to delayed or even wrong business decisions at times. Savepoint is proposed to assist users in recovering data from a previous state, with two operations: "savepoint" and "restore".
>>>>>>
>>>>>> "savepoint" saves the Paimon table as of the commit time; therefore, if there is a savepoint, the data generated in the corresponding commit cannot be cleaned. Meanwhile, a savepoint lets the user restore the table to this savepoint at a later point in time if need be. Along similar lines, a savepoint cannot be triggered on a commit that has already been cleaned up. A savepoint is synonymous with taking a backup, except that we don't make a new copy of the table; we just save the state of the table elegantly so that we can restore it later when needed.
>>>>>>
>>>>>> "restore" lets you restore your table to one of the savepoint commits. It cannot be undone (or reversed), so care should be taken before doing a restore. At this time, Paimon would delete all data files and commit files (timeline files) greater than the savepoint commit to which the table is being restored.
>>>>>>
>>>>>> BTW, it would be better to introduce a snapshot view based on savepoints, which could improve the query performance of historical data for Paimon tables.
>>>>>>
>>>>>> -- About the Public API of savepoint
>>>>>>
>>>>>> The savepoint interfaces currently introduced in the Public API are not enough for users, for example, deleteSavepoint, restoreSavepoint, etc.
>>>>>>
>>>>>> -- About "Paimon's savepoint need to be combined with Flink's savepoint":
>>>>>>
>>>>>> If Paimon supports the savepoint mechanism and provides savepoint interfaces, the integration with Flink's savepoint is not a blocker for this proposal.
>>>>>>
>>>>>> In summary, savepoint is not only used to improve the query performance of historical data, but also for disaster recovery processing.
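For illustration, a minimal sketch of the restore/rollback behaviour described in the quoted mail, under the assumption that a savepoint records the id of the snapshot it was created from (all names are hypothetical):

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch of the "restore"/"rollback-to" behaviour described in the quoted
// mail: everything committed after the savepoint is dropped. Names are hypothetical.
public class RollbackSketch {

    /** Minimal stand-in for a snapshot entry. */
    public static final class Snapshot {
        final long id;
        Snapshot(long id) { this.id = id; }
    }

    /**
     * Returns the snapshots that have to be removed when rolling the table back to the
     * snapshot captured by the savepoint, i.e. every snapshot newer than that one.
     */
    public List<Snapshot> snapshotsToRemove(List<Snapshot> allSnapshots, long savepointSnapshotId) {
        List<Snapshot> toRemove = new ArrayList<>();
        for (Snapshot s : allSnapshots) {
            if (s.id > savepointSnapshotId) {
                toRemove.add(s);
            }
        }
        return toRemove;
    }
}
```

The data files of the removed snapshots then become deletion candidates, subject to the same reference check as above.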
>>>>>> On 2023/05/17 09:53:11 Jingsong Li wrote:
>>>>>>> What Shammon mentioned is interesting. I agree with what he said about the differences in savepoints between databases and stream computing.
>>>>>>>
>>>>>>> About "Paimon's savepoint need to be combined with Flink's savepoint":
>>>>>>>
>>>>>>> I think it is possible, but we may need to deal with this with another mechanism, because the snapshots after a savepoint may expire. We would need to compare data between two savepoints to generate incremental data for streaming reads.
>>>>>>>
>>>>>>> But this may not need to block the FLIP; it looks like the current design does not break the future combination?
>>>>>>>
>>>>>>> Best,
>>>>>>> Jingsong
>>>>>>>
>>>>>>> On Wed, May 17, 2023 at 5:33 PM Shammon FY <[email protected]> wrote:
>>>>>>>>
>>>>>>>> Hi Caizhi,
>>>>>>>>
>>>>>>>> Thanks for your comments. As you mentioned, I think we may need to discuss the role of savepoint in Paimon.
>>>>>>>>
>>>>>>>> If I understand correctly, the main feature of savepoint in the current PIP is that the savepoint will not expire, and users can query the savepoint via time travel. Besides that, there are savepoints in databases and in Flink.
>>>>>>>>
>>>>>>>> 1. Savepoint in databases. A database can roll back table data to a specified 'version' based on a savepoint. So the key point of a savepoint in a database is to roll back data.
>>>>>>>>
>>>>>>>> 2. Savepoint in Flink. Users can trigger a savepoint with a specific 'path' and save all state data of a job to the savepoint. Then users can create a new job based on the savepoint to continue consuming incremental data. I think the core capabilities are: backing up a job, and resuming a job based on the savepoint.
>>>>>>>>
>>>>>>>> In addition to the above, Paimon may also face data write corruption and need to recover data based on a specified savepoint. So we may need to consider what abilities a Paimon savepoint needs besides the ones mentioned in the current PIP.
>>>>>>>>
>>>>>>>> Additionally, as mentioned above, Flink also has a savepoint mechanism. When streaming data from Flink to Paimon, does Paimon's savepoint need to be combined with Flink's savepoint?
>>>>>>>>
>>>>>>>> Best,
>>>>>>>> Shammon FY
>>>>>>>>
>>>>>>>> On Wed, May 17, 2023 at 4:02 PM Caizhi Weng <[email protected]> wrote:
>>>>>>>>>
>>>>>>>>> Hi developers!
>>>>>>>>>
>>>>>>>>> Thanks Zelin for bringing up the discussion. The proposal seems good to me overall. However, I'd also like to bring up a few points.
>>>>>>>>>
>>>>>>>>> 1. As Jingsong mentioned, the Savepoint class should not become a public API, at least for now. What we need to discuss for the public API is how the users can create or delete savepoints. For example, what the table option looks like, what commands and options are provided for the Flink action, etc.
>>>>>>>>>
>>>>>>>>> 2. Currently most Flink actions are related to streaming processing, so only Flink can support them. However, savepoint creation and deletion seem like features for batch processing. So aside from Flink actions, shall we also provide something like Spark actions for savepoints?
>>>>>>>>>
>>>>>>>>> I would also like to comment on Shammon's views.
>>>>>>>>>
>>>>>>>>>> Should we introduce an option for savepoint path which may be different from 'warehouse'? Then users can backup the data of savepoint.
>>>>>>>>>
>>>>>>>>> I don't see why this is necessary. To back up a table, the user just needs to copy all files from the table directory. Savepoint in Paimon, as far as I understand, is mainly for users to review historical data, not for backing up tables.
>>>>>>>>>
>>>>>>>>>> Will the savepoint copy data files from snapshot or only save meta files?
>>>>>>>>>
>>>>>>>>> It would be a heavy burden if a savepoint copied all its files. As I mentioned above, savepoint is not for backing up tables.
>>>>>>>>>
>>>>>>>>>> How can users create a new table and restore data from the specified savepoint?
>>>>>>>>>
>>>>>>>>> This reminds me of savepoints in Flink. Still, savepoint is not for backing up tables, so I guess we don't need to support "restoring data" from a savepoint.
>>>>>>>>>
>>>>>>>>> Shammon FY <[email protected]> wrote on Wed, May 17, 2023 at 10:32:
>>>>>>>>>
>>>>>>>>>> Thanks Zelin for initiating this discussion. I have some comments:
>>>>>>>>>>
>>>>>>>>>> 1. Should we introduce an option for savepoint path which may be different from 'warehouse'? Then users can back up the data of a savepoint.
>>>>>>>>>>
>>>>>>>>>> 2. Will the savepoint copy data files from the snapshot or only save meta files? The description in the PIP, "After we introduce savepoint, we should also check if the data files are used by savepoints.", looks like we only save meta files for a savepoint.
>>>>>>>>>>
>>>>>>>>>> 3. How can users create a new table and restore data from the specified savepoint?
>>>>>>>>>>
>>>>>>>>>> Best,
>>>>>>>>>> Shammon FY
>>>>>>>>>>
>>>>>>>>>> On Wed, May 17, 2023 at 10:19 AM Jingsong Li <[email protected]> wrote:
>>>>>>>>>>>
>>>>>>>>>>> Thanks Zelin for driving.
>>>>>>>>>>>
>>>>>>>>>>> Some comments:
>>>>>>>>>>>
>>>>>>>>>>> 1. I think it's possible to move `Proposed Changes` to the top; the Public API has no meaning if I don't know how it is done.
>>>>>>>>>>>
>>>>>>>>>>> 2. Public API: Savepoint and SavepointManager are not Public API; only the Flink action or configuration options should be public API.
>>>>>>>>>>>
>>>>>>>>>>> 3. Maybe we can have a separate chapter to describe `savepoint.create-interval`, maybe 'Periodically savepoint'? It is not just an interval, because the real use case is a savepoint after 0:00.
>>>>>>>>>>>
>>>>>>>>>>> 4. About 'Interaction with Snapshot', to be continued...
>>>>>>>>>>>
>>>>>>>>>>> Best,
>>>>>>>>>>> Jingsong
>>>>>>>>>>>
>>>>>>>>>>> On Tue, May 16, 2023 at 7:07 PM yu zelin <[email protected]> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>> Hi, Paimon Devs,
>>>>>>>>>>>>
>>>>>>>>>>>> I'd like to start a discussion about PIP-4 [1]. In this PIP, I want to talk about why we need savepoint, and share some thoughts about managing and using it. Looking forward to your questions and suggestions.
>>>>>>>>>>>>
>>>>>>>>>>>> Best,
>>>>>>>>>>>> Yu Zelin
>>>>>>>>>>>>
>>>>>>>>>>>> [1] https://cwiki.apache.org/confluence/x/NxE0Dw
