Re: [DISCUSS] PIP-4 Support savepoint

Jingsong Li Mon, 22 May 2023 06:13:47 -0700

FYI

The PIP lacks a table to show Discussion thread & Vote thread & ISSUE...


Best
Jingsong

On Mon, May 22, 2023 at 4:48 PM yu zelin <[email protected]> wrote:
>
> Hi, all,
>
> Thank you all for your suggestions and questions. After reading them, I have
> adopted some of them, and I want to share my opinions here.
>
> To keep my statements clear, I will still use the word `savepoint`. Once we
> reach a consensus, the name may change.
>
> 1. The purposes of savepoint
>
> As Shammon mentioned, Flink and databases also have the concept of
> `savepoint`, so it’s better to clarify the purposes of our savepoint. Thanks
> to Nicholas and Jingsong; I think your explanations are very clear. I’d like
> to give my summary:
>
> (1) Fault recovery (or we can say disaster recovery). Users can ROLL BACK to
> a savepoint if needed. If a user rolls back to a savepoint, the table will
> hold the data in the savepoint, and the data committed after the savepoint
> will be deleted. In this scenario we need savepoints because snapshots may
> have expired; a savepoint can be kept longer and preserve the user’s old data.
>
> (2) Record versions of data at a longer interval (typically daily or
> weekly). With savepoints, users can query the old data in batch mode.
> Compared with copying records to a new table or merging incremental records
> with old records (like using MERGE INTO in Hive), the savepoint is more
> lightweight because we don’t copy data files; we just record their
> metadata.
>
> As you can see, savepoint is very similar to snapshot. The differences are:
>
> (1) Savepoint lives longer. In most cases, a snapshot’s lifetime ranges from
> several minutes to a few hours. We expect a savepoint to live for several
> days, weeks, or even months.
>
> (2) Savepoint is mainly used for batch reading of historical data. In this
> PIP, we don’t introduce streaming reading for savepoints.
>
> 2. Candidates of name
>
> I agree with Jingsong that we can use a new name. Since the purpose and
> mechanism of savepoint (savepoint is very similar to snapshot) are similar to
> those of `tag` in Iceberg, maybe we can use `tag`.
>
> In my opinion, an alternative is `anchor`. All the snapshots are like the 
> navigation path of the streaming data, and an `anchor` can stop it in a place.
>
> 3. Public table operations and options
>
> We plan to expose some operations and table options for users to manage
> savepoints.
>
> (1) Operations (currently for Flink)
> We provide Flink actions to manage savepoints:
>     create-savepoint: Generate a savepoint from the latest snapshot. Creating
> one from a specified snapshot is also supported.
>     delete-savepoint: Delete a specified savepoint.
>     rollback-to: Roll back to a specified savepoint.
>
> (2) Table options
> We plan to provide options for creating savepoints periodically:
>     savepoint.create-time: When to create the savepoint. Example: 00:00
>     savepoint.create-interval: Interval between the creation of two
> savepoints. Example: 2 d.
>     savepoint.time-retained: The maximum time to retain savepoints.
>
> (3) Procedures (future work)
> Spark supports SQL extensions. After we support the Spark CALL statement, we
> can provide procedures to create, delete, or roll back to a savepoint for
> Spark users.
>
> Support for CALL is on the Flink road map. In a future version, we can also
> support savepoint-related procedures for Flink users.
>
> 4. Expiration of data files
>
> Currently, when a snapshot expires, data files that are not used by other
> snapshots are deleted. After we introduce the savepoint, we must make sure the
> data files referenced by a savepoint will not be deleted.
>
> Conversely, when a savepoint is deleted, the data files that are not used by
> existing snapshots and other savepoints will be deleted.
>
> I have written some POC code to implement it. I will update the mechanism in
> the PIP soon.
>
> Best,
> Yu Zelin
>
> > On May 21, 2023, at 20:54, Jingsong Li <[email protected]> wrote:
> >
> > Thanks Yun for your information.
> >
> > We need to be careful to avoid confusion between Paimon and Flink
> > concepts about "savepoint"
> >
> > Maybe we don't have to insist on using this "savepoint", for example,
> > TAG is also a candidate just like Iceberg [1]
> >
> > [1] https://iceberg.apache.org/docs/latest/branching/
> >
> > Best,
> > Jingsong
> >
> > On Sun, May 21, 2023 at 8:51 PM Jingsong Li <[email protected]> wrote:
> >>
> >> Thanks Nicholas for your detailed requirements.
> >>
> >> We need to supplement user requirements in the PIP, which is mainly aimed
> >> at two purposes:
> >> 1. Fault recovery for data errors (named: restore or rollback-to)
> >> 2. Recording versions at, for example, the day level, targeting batch
> >> queries
> >>
> >> Best,
> >> Jingsong
> >>
> >> On Sat, May 20, 2023 at 2:55 PM Yun Tang <[email protected]> wrote:
> >>>
> >>> Hi Guys,
> >>>
> >>> Since we use Paimon with Flink in most cases, I think we need to
> >>> distinguish what the same word "savepoint" means in different systems.
> >>>
> >>> For Flink, savepoint means:
> >>>
> >>>  1.  Triggered by users, not periodically triggered by the system itself.
> >>> However, this PIP wants to support creating it periodically.
> >>>  2.  Even the so-called incremental native savepoint [1] does not
> >>> depend on the previous checkpoints or savepoints; it still copies
> >>> files on DFS to the self-contained savepoint folder. However, from the
> >>> description in this PIP about the deletion of expired snapshot files,
> >>> a Paimon savepoint will refer to the previously existing files directly.
> >>>
> >>> I don't think we need to make the semantics of Paimon totally the same as
> >>> Flink's. However, we need to introduce a table to show the differences
> >>> compared with Flink and discuss them.
> >>>
> >>> [1] 
> >>> https://cwiki.apache.org/confluence/display/FLINK/FLIP-203%3A+Incremental+savepoints#FLIP203:Incrementalsavepoints-Semantic
> >>>
> >>> Best
> >>> Yun Tang
> >>> ________________________________
> >>> From: Nicholas Jiang <[email protected]>
> >>> Sent: Friday, May 19, 2023 17:40
> >>> To: [email protected] <[email protected]>
> >>> Subject: Re: [DISCUSS] PIP-4 Support savepoint
> >>>
> >>> Hi Guys,
> >>>
> >>> Thanks Zelin for driving the savepoint proposal. I have some opinions
> >>> on savepoint:
> >>>
> >>> -- About "introduce savepoint for Paimon to persist full data in a time 
> >>> point"
> >>>
> >>> The motivation of the savepoint proposal is more like snapshot TTL
> >>> management. Actually, disaster recovery is mission critical for
> >>> any software. Especially when it comes to data systems, the impact could
> >>> be very serious, leading to delays in business decisions or even wrong
> >>> business decisions at times. Savepoint is proposed to assist users in
> >>> recovering data from a previous state: "savepoint" and "restore".
> >>>
> >>> "savepoint" saves the Paimon table as of the commit time; therefore, if
> >>> there is a savepoint, the data generated in the corresponding commit
> >>> cannot be cleaned. Meanwhile, a savepoint lets the user restore the table
> >>> to this savepoint at a later point in time if need be. On similar lines,
> >>> a savepoint cannot be triggered on a commit that is already cleaned up.
> >>> A savepoint is synonymous with taking a backup, except that we don't make
> >>> a new copy of the table; we just save the state of the table elegantly so
> >>> that we can restore it later when needed.
> >>>
> >>> "restore" lets you restore your table to one of the savepoint commits.
> >>> It cannot be undone (or reversed), so care should be taken
> >>> before doing a restore. On restore, Paimon would delete all data files
> >>> and commit files (timeline files) later than the savepoint commit to
> >>> which the table is being restored.
> >>>
> >>> BTW, it's better to introduce a snapshot view based on savepoint, which
> >>> could improve query performance of historical data for Paimon tables.
> >>>
> >>> -- About Public API of savepoint
> >>>
> >>> The savepoint interfaces currently introduced in the Public API are not
> >>> enough; users also need, for example, deleteSavepoint, restoreSavepoint,
> >>> etc.
> >>>
> >>> -- About "Paimon's savepoint need to be combined with Flink's savepoint":
> >>>
> >>> If Paimon supports a savepoint mechanism and provides savepoint interfaces,
> >>> integration with Flink's savepoint is not blocked by this proposal.
> >>>
> >>> In summary, savepoint is not only used to improve the query performance 
> >>> of historical data, but also used for disaster recovery processing.
> >>>
> >>> On 2023/05/17 09:53:11 Jingsong Li wrote:
> >>>> What Shammon mentioned is interesting. I agree with what he said about
> >>>> the differences in savepoints between databases and stream computing.
> >>>>
> >>>> About "Paimon's savepoint need to be combined with Flink's savepoint":
> >>>>
> >>>> I think it is possible, but we may need to deal with this in another
> >>>> mechanism, because the snapshots after savepoint may expire. We need
> >>>> to compare data between two savepoints to generate incremental data
> >>>> for streaming read.
> >>>>
> >>>> But this may not need to block the PIP; it looks like the current design
> >>>> does not break the future combination?
> >>>>
> >>>> Best,
> >>>> Jingsong
> >>>>
> >>>> On Wed, May 17, 2023 at 5:33 PM Shammon FY <[email protected]> wrote:
> >>>>>
> >>>>> Hi Caizhi,
> >>>>>
> >>>>> Thanks for your comments. As you mentioned, I think we may need to
> >>>>> discuss the role of savepoint in Paimon.
> >>>>>
> >>>>> If I understand correctly, the main feature of savepoint in the current
> >>>>> PIP is that the savepoint will not be expired, and users can perform a
> >>>>> query on the savepoint according to time-travel. Besides that, there is
> >>>>> savepoint in the database and Flink.
> >>>>>
> >>>>> 1. Savepoint in database. The database can roll back table data to the
> >>>>> specified 'version' based on savepoint. So the key point of savepoint in
> >>>>> the database is to rollback data.
> >>>>>
> >>>>> 2. Savepoint in Flink. Users can trigger a savepoint with a specific
> >>>>> 'path', and save all data of state to the savepoint for job. Then users
> >>>>> can create a new job based on the savepoint to continue consuming
> >>>>> incremental data. I think the core capabilities are: backup for a job,
> >>>>> and resume a job based on the savepoint.
> >>>>>
> >>>>> In addition to the above, Paimon may also face data write corruption and
> >>>>> need to recover data based on the specified savepoint. So we may need to
> >>>>> consider what abilities Paimon's savepoint needs besides the ones
> >>>>> mentioned in the current PIP.
> >>>>>
> >>>>> Additionally, as mentioned above, Flink also has a savepoint mechanism.
> >>>>> During the process of streaming data from Flink to Paimon, does Paimon's
> >>>>> savepoint need to be combined with Flink's savepoint?
> >>>>>
> >>>>>
> >>>>> Best,
> >>>>> Shammon FY
> >>>>>
> >>>>>
> >>>>> On Wed, May 17, 2023 at 4:02 PM Caizhi Weng <[email protected]> 
> >>>>> wrote:
> >>>>>
> >>>>>> Hi developers!
> >>>>>>
> >>>>>> Thanks Zelin for bringing up the discussion. The proposal seems good
> >>>>>> to me overall. However I'd also like to bring up a few options.
> >>>>>>
> >>>>>> 1. As Jingsong mentioned, Savepoint class should not become a public
> >>>>>> API, at least for now. What we need to discuss for the public API is how
> >>>>>> the users can create or delete savepoints. For example, what the table
> >>>>>> option looks like, what commands and options are provided for the Flink
> >>>>>> action, etc.
> >>>>>>
> >>>>>> 2. Currently most Flink actions are related to streaming processing, so
> >>>>>> only Flink can support them. However, savepoint creation and deletion
> >>>>>> seem like features for batch processing. So aside from Flink actions,
> >>>>>> shall we also provide something like Spark actions for savepoints?
> >>>>>>
> >>>>>> I would also like to comment on Shammon's views.
> >>>>>>
> >>>>>>> Should we introduce an option for savepoint path which may be different
> >>>>>>> from 'warehouse'? Then users can backup the data of savepoint.
> >>>>>>>
> >>>>>>
> >>>>>> I don't think this is necessary. To back up a table, the user just needs
> >>>>>> to copy all files from the table directory. Savepoint in Paimon, as far
> >>>>>> as I understand, is mainly for users to review historical data, not for
> >>>>>> backing up tables.
> >>>>>>
> >>>>>>> Will the savepoint copy data files from snapshot or only save meta
> >>>>>>> files?
> >>>>>>>
> >>>>>>
> >>>>>> It would be a heavy burden if a savepoint copies all its files. As I
> >>>>>> mentioned above, savepoint is not for backing up tables.
> >>>>>>
> >>>>>>> How can users create a new table and restore data from the specified
> >>>>>>> savepoint?
> >>>>>>
> >>>>>>
> >>>>>> This reminds me of savepoints in Flink. Still, savepoint is not for
> >>>>>> backing up tables so I guess we don't need to support "restoring data"
> >>>>>> from a savepoint.
> >>>>>>
> >>>>>> Shammon FY <[email protected]> wrote on Wed, May 17, 2023 at 10:32:
> >>>>>>
> >>>>>>> Thanks Zelin for initiating this discussion. I have some comments:
> >>>>>>>
> >>>>>>> 1. Should we introduce an option for savepoint path which may be
> >>>>>>> different from 'warehouse'? Then users can backup the data of savepoint.
> >>>>>>>
> >>>>>>> 2. Will the savepoint copy data files from snapshot or only save meta
> >>>>>>> files? The description in the PIP "After we introduce savepoint, we
> >>>>>>> should also check if the data files are used by savepoints." looks like
> >>>>>>> we only save meta files for savepoint.
> >>>>>>>
> >>>>>>> 3. How can users create a new table and restore data from the
> >>>>>>> specified savepoint?
> >>>>>>>
> >>>>>>> Best,
> >>>>>>> Shammon FY
> >>>>>>>
> >>>>>>>
> >>>>>>> On Wed, May 17, 2023 at 10:19 AM Jingsong Li <[email protected]>
> >>>>>>> wrote:
> >>>>>>>
> >>>>>>>> Thanks Zelin for driving.
> >>>>>>>>
> >>>>>>>> Some comments:
> >>>>>>>>
> >>>>>>>> 1. I think it's possible to advance `Proposed Changes` to the top,
> >>>>>>>> Public API has no meaning if I don't know how to do it.
> >>>>>>>>
> >>>>>>>> 2. Public API: Savepoint and SavepointManager are not public API; only
> >>>>>>>> the Flink action or configuration option should be public API.
> >>>>>>>>
> >>>>>>>> 3. Maybe we can have a separate chapter to describe
> >>>>>>>> `savepoint.create-interval`, maybe 'Periodically savepoint'? It is not
> >>>>>>>> just an interval, because the real use case is a savepoint after 0:00.
> >>>>>>>>
> >>>>>>>> 4. About 'Interaction with Snapshot', to be continued ...
> >>>>>>>>
> >>>>>>>> Best,
> >>>>>>>> Jingsong
> >>>>>>>>
> >>>>>>>> On Tue, May 16, 2023 at 7:07 PM yu zelin <[email protected]>
> >>>>>>>> wrote:
> >>>>>>>>>
> >>>>>>>>> Hi, Paimon Devs,
> >>>>>>>>>     I’d like to start a discussion about PIP-4[1]. In this PIP, I
> >>>>>>>>> want to talk about why we need savepoint, and some thoughts about
> >>>>>>>>> managing and using savepoint. Looking forward to your questions and
> >>>>>>>>> suggestions.
> >>>>>>>>>
> >>>>>>>>> Best,
> >>>>>>>>> Yu Zelin
> >>>>>>>>>
> >>>>>>>>> [1] https://cwiki.apache.org/confluence/x/NxE0Dw
> >>>>>>>>
> >>>>>>>
> >>>>>>
> >>>>
>
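
A minimal sketch of the file-expiration rule from section 4 of Yu Zelin's message ("Expiration of data files"), assuming each snapshot or savepoint is represented simply as the set of data-file names it references. The function name `deletable_files` and this representation are hypothetical illustrations, not Paimon's actual implementation or the POC mentioned in the thread.

```python
from typing import Iterable, Set


def deletable_files(
    expired_files: Iterable[str],
    live_snapshots: Iterable[Iterable[str]],
    live_savepoints: Iterable[Iterable[str]],
) -> Set[str]:
    """Return the data files that may be removed once a snapshot or savepoint is dropped.

    A file is deletable only if no remaining snapshot and no remaining
    savepoint still references it (hypothetical model: one set of file
    names per snapshot or savepoint).
    """
    still_used: Set[str] = set()
    for files in live_snapshots:
        still_used |= set(files)
    for files in live_savepoints:
        still_used |= set(files)
    return set(expired_files) - still_used
```

For example, if an expiring snapshot references files `a` and `b`, a live snapshot references `b` and `c`, and a savepoint references `a`, nothing is deleted; without the savepoint, only `a` would be deleted. The same function covers the converse case of deleting a savepoint: pass its files as `expired_files` and the remaining snapshots and savepoints as the live sets.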
