FYI, the PIP lacks a table listing the discussion thread, the vote thread, and the ISSUE...

Best,
Jingsong

On Mon, May 22, 2023 at 4:48 PM yu zelin <[email protected]> wrote:
>
> Hi, all,
>
> Thank you all for your suggestions and questions. After reading them, I have
> adopted some of them and want to share my opinions here.
>
> To keep my statements clear, I will still use the word `savepoint`. Once we
> reach a consensus, the name may change.
>
> 1. The purposes of savepoint
>
> As Shammon mentioned, Flink and databases also have the concept of
> `savepoint`, so it's better to clarify the purposes of our savepoint. Thanks
> to Nicholas and Jingsong; I think your explanations are very clear. I'd like
> to give my summary:
>
> (1) Fault recovery (or we can say disaster recovery). Users can ROLL BACK to
> a savepoint if needed. If a user rolls back to a savepoint, the table will
> hold the data in the savepoint, and the data committed after the savepoint
> will be deleted. In this scenario we need savepoints because snapshots may
> have expired; a savepoint can be kept longer and preserve the user's old data.
>
> (2) Recording versions of data at a longer interval (typically at the daily
> or weekly level). With savepoints, users can query the old data in batch mode.
> Compared with copying records to a new table or merging incremental records
> with old records (like using MERGE INTO in Hive), a savepoint is more
> lightweight because we don't copy data files; we just record their metadata.
>
> As you can see, a savepoint is very similar to a snapshot. The differences are:
>
> (1) A savepoint lives longer. In most cases, a snapshot's lifetime is about
> several minutes to hours. We expect a savepoint to live for several days,
> weeks, or even months.
>
> (2) A savepoint is mainly used for batch reading of historical data. In this
> PIP, we don't introduce streaming reading for savepoints.
>
> 2. Candidate names
>
> I agree with Jingsong that we can use a new name. Since the purpose and
> mechanism of savepoint (a savepoint is very similar to a snapshot) are
> similar to `tag` in Iceberg, maybe we can use `tag`.
>
> In my opinion, an alternative is `anchor`. All the snapshots are like the
> navigation path of the streaming data, and an `anchor` can stop it at a place.
>
> 3. Public table operations and options
>
> We are supposed to expose some operations and table options for users to
> manage savepoints.
>
> (1) Operations (currently for Flink)
> We provide Flink actions to manage savepoints:
>     create-savepoint: Generates a savepoint from the latest snapshot.
>     Creating one from a specified snapshot is also supported.
>     delete-savepoint: Deletes the specified savepoint.
>     rollback-to: Rolls back to the specified savepoint.
>
> (2) Table options
> We plan to provide options for creating savepoints periodically:
>     savepoint.create-time: When to create the savepoint. Example: 00:00
>     savepoint.create-interval: Interval between the creation of two
>     savepoints. Example: 2 d.
>     savepoint.time-retained: The maximum time to retain savepoints.
>
> (3) Procedures (future work)
> Spark supports SQL extensions. After we support the Spark CALL statement, we
> can provide procedures to create, delete, or roll back to a savepoint for
> Spark users.
>
> Support for CALL is on the roadmap of Flink. In a future version, we can also
> support savepoint-related procedures for Flink users.
>
> 4. Expiration of data files
>
> Currently, when a snapshot expires, the data files that are not used by other
> snapshots are deleted. After we introduce savepoints, we must make sure the
> data files referenced by a savepoint will not be deleted.
>
> Conversely, when a savepoint is deleted, the data files that are not used by
> existing snapshots or other savepoints will be deleted.
>
> I have written some POC code to implement it. I will update the mechanism in
> the PIP soon.
>
> Best,
> Yu Zelin
>
> > On May 21, 2023, at 20:54, Jingsong Li <[email protected]> wrote:
> >
> > Thanks Yun for your information.
> >
> > We need to be careful to avoid confusion between the Paimon and Flink
> > concepts of "savepoint".
> >
> > Maybe we don't have to insist on using the name "savepoint"; for example,
> > TAG is also a candidate, just like in Iceberg [1].
> >
> > [1] https://iceberg.apache.org/docs/latest/branching/
> >
> > Best,
> > Jingsong
> >
> > On Sun, May 21, 2023 at 8:51 PM Jingsong Li <[email protected]> wrote:
> >>
> >> Thanks Nicholas for your detailed requirements.
> >>
> >> We need to supplement the user requirements in the PIP, which is mainly
> >> aimed at two purposes:
> >> 1. Fault recovery for data errors (named: restore or rollback-to)
> >> 2. Recording versions at, for example, the day level, targeting batch
> >> queries
> >>
> >> Best,
> >> Jingsong
> >>
> >> On Sat, May 20, 2023 at 2:55 PM Yun Tang <[email protected]> wrote:
> >>>
> >>> Hi Guys,
> >>>
> >>> Since we use Paimon with Flink in most cases, I think we need to clarify
> >>> what the same word "savepoint" means in different systems.
> >>>
> >>> For Flink, savepoint means:
> >>>
> >>> 1. Triggered by users, not periodically triggered by the system itself.
> >>> However, this PIP wants to support creating it periodically.
> >>> 2. Even the so-called incremental native savepoint [1] does not
> >>> depend on the previous checkpoints or savepoints; it still copies
> >>> files on DFS to the self-contained savepoint folder. However, from the
> >>> description in this PIP of the deletion of expired snapshot files, a
> >>> Paimon savepoint will refer to the previously existing files directly.
> >>>
> >>> I don't think we need to make the semantics of Paimon totally the same as
> >>> Flink's. However, we need to introduce a table showing the differences
> >>> compared with Flink and discuss those differences.
> >>>
> >>> [1]
> >>> https://cwiki.apache.org/confluence/display/FLINK/FLIP-203%3A+Incremental+savepoints#FLIP203:Incrementalsavepoints-Semantic
> >>>
> >>> Best
> >>> Yun Tang
> >>> ________________________________
> >>> From: Nicholas Jiang <[email protected]>
> >>> Sent: Friday, May 19, 2023 17:40
> >>> To: [email protected] <[email protected]>
> >>> Subject: Re: [DISCUSS] PIP-4 Support savepoint
> >>>
> >>> Hi Guys,
> >>>
> >>> Thanks Zelin for driving the savepoint proposal. Here are some opinions
> >>> on savepoint:
> >>>
> >>> -- About "introduce savepoint for Paimon to persist full data in a time
> >>> point"
> >>>
> >>> The motivation of the savepoint proposal is more like snapshot TTL
> >>> management. Actually, disaster recovery is very much mission critical for
> >>> any software. Especially when it comes to data systems, the impact could
> >>> be very serious, leading to delays in business decisions or even wrong
> >>> business decisions at times. Savepoint is proposed to assist users in
> >>> recovering data from a previous state: "savepoint" and "restore".
> >>>
> >>> "savepoint" saves the Paimon table as of the commit time; therefore, if
> >>> there is a savepoint, the data generated in the corresponding commit
> >>> cannot be cleaned. Meanwhile, a savepoint lets the user restore the table
> >>> to this savepoint at a later point in time if need be. On similar lines,
> >>> a savepoint cannot be triggered on a commit that is already cleaned up.
> >>> A savepoint is synonymous with taking a backup, except that we don't make
> >>> a new copy of the table; we just save the state of the table elegantly so
> >>> that we can restore it later when needed.
> >>>
> >>> "restore" lets you restore your table to one of the savepoint commits.
> >>> It cannot be undone (or reversed), so care should be taken
> >>> before doing a restore. At this time, Paimon would delete all data files
> >>> and commit files (timeline files) greater than the savepoint commit to
> >>> which the table is being restored.
> >>>
> >>> BTW, it's better to introduce a snapshot view based on savepoint, which
> >>> could improve query performance on historical data for Paimon tables.
> >>>
> >>> -- About Public API of savepoint
> >>>
> >>> The savepoint interfaces currently introduced in the Public API are not
> >>> enough for users, for example, deleteSavepoint, restoreSavepoint, etc.
> >>>
> >>> -- About "Paimon's savepoint need to be combined with Flink's savepoint":
> >>>
> >>> If Paimon supports the savepoint mechanism and provides savepoint
> >>> interfaces, the integration with Flink's savepoint is not blocked for
> >>> this proposal.
> >>>
> >>> In summary, savepoint is not only used to improve the query performance
> >>> of historical data, but also used for disaster recovery processing.
> >>>
> >>> On 2023/05/17 09:53:11 Jingsong Li wrote:
> >>>> What Shammon mentioned is interesting. I agree with what he said about
> >>>> the differences in savepoints between databases and stream computing.
> >>>>
> >>>> About "Paimon's savepoint need to be combined with Flink's savepoint":
> >>>>
> >>>> I think it is possible, but we may need to deal with this in another
> >>>> mechanism, because the snapshots after a savepoint may expire. We need
> >>>> to compare data between two savepoints to generate incremental data
> >>>> for streaming read.
> >>>>
> >>>> But this may not need to block the PIP; it looks like the current design
> >>>> does not break the future combination?
> >>>>
> >>>> Best,
> >>>> Jingsong
> >>>>
> >>>> On Wed, May 17, 2023 at 5:33 PM Shammon FY <[email protected]> wrote:
> >>>>>
> >>>>> Hi Caizhi,
> >>>>>
> >>>>> Thanks for your comments. As you mentioned, I think we may need to
> >>>>> discuss the role of savepoint in Paimon.
> >>>>>
> >>>>> If I understand correctly, the main feature of savepoint in the current
> >>>>> PIP is that the savepoint will not expire, and users can query the
> >>>>> savepoint via time travel. Besides that, there are savepoints in
> >>>>> databases and in Flink.
> >>>>>
> >>>>> 1. Savepoint in databases. The database can roll back table data to the
> >>>>> specified 'version' based on a savepoint. So the key point of savepoint
> >>>>> in the database is to roll back data.
> >>>>>
> >>>>> 2. Savepoint in Flink. Users can trigger a savepoint with a specific
> >>>>> 'path' and save all state data of the job to the savepoint. Then users
> >>>>> can create a new job based on the savepoint to continue consuming
> >>>>> incremental data. I think the core capabilities are: backing up a job
> >>>>> and resuming a job based on the savepoint.
> >>>>>
> >>>>> In addition to the above, Paimon may also face data write corruption and
> >>>>> need to recover data based on the specified savepoint. So we may need to
> >>>>> consider what abilities a Paimon savepoint needs besides the ones
> >>>>> mentioned in the current PIP.
> >>>>>
> >>>>> Additionally, as mentioned above, Flink also has a savepoint
> >>>>> mechanism. During the process of streaming data from Flink to
> >>>>> Paimon, does Paimon's savepoint need to be combined with Flink's
> >>>>> savepoint?
> >>>>>
> >>>>>
> >>>>> Best,
> >>>>> Shammon FY
> >>>>>
> >>>>>
> >>>>> On Wed, May 17, 2023 at 4:02 PM Caizhi Weng <[email protected]>
> >>>>> wrote:
> >>>>>
> >>>>>> Hi developers!
> >>>>>>
> >>>>>> Thanks Zelin for bringing up the discussion. The proposal seems good
> >>>>>> to me overall. However, I'd also like to bring up a few options.
> >>>>>>
> >>>>>> 1. As Jingsong mentioned, the Savepoint class should not become a
> >>>>>> public API, at least for now. What we need to discuss for the public
> >>>>>> API is how users can create or delete savepoints. For example, what
> >>>>>> the table option looks like, what commands and options are provided
> >>>>>> for the Flink action, etc.
> >>>>>>
> >>>>>> 2. Currently most Flink actions are related to streaming processing, so
> >>>>>> only Flink can support them. However, savepoint creation and deletion
> >>>>>> seem like features for batch processing. So aside from Flink actions,
> >>>>>> shall we also provide something like Spark actions for savepoints?
> >>>>>>
> >>>>>> I would also like to comment on Shammon's views.
> >>>>>>
> >>>>>>> Should we introduce an option for savepoint path which may be
> >>>>>>> different from 'warehouse'? Then users can backup the data of
> >>>>>>> savepoint.
> >>>>>>>
> >>>>>>
> >>>>>> I don't see that this is necessary. To back up a table, the user just
> >>>>>> needs to copy all files from the table directory. Savepoint in Paimon,
> >>>>>> as far as I understand, is mainly for users to review historical data,
> >>>>>> not for backing up tables.
> >>>>>>
> >>>>>>> Will the savepoint copy data files from snapshot or only save meta
> >>>>>>> files?
> >>>>>>>
> >>>>>>
> >>>>>> It would be a heavy burden if a savepoint copied all its files. As I
> >>>>>> mentioned above, savepoint is not for backing up tables.
> >>>>>>
> >>>>>>> How can users create a new table and restore data from the specified
> >>>>>>> savepoint?
> >>>>>>
> >>>>>>
> >>>>>> This reminds me of savepoints in Flink. Still, savepoint is not for
> >>>>>> backing up tables, so I guess we don't need to support "restoring
> >>>>>> data" from a savepoint.
> >>>>>>
> >>>>>> On Wed, May 17, 2023 at 10:32, Shammon FY <[email protected]> wrote:
> >>>>>>
> >>>>>>> Thanks Zelin for initiating this discussion. I have some comments:
> >>>>>>>
> >>>>>>> 1. Should we introduce an option for savepoint path which may be
> >>>>>>> different from 'warehouse'? Then users can back up the data of a
> >>>>>>> savepoint.
> >>>>>>>
> >>>>>>> 2. Will the savepoint copy data files from the snapshot or only save
> >>>>>>> meta files? The description in the PIP, "After we introduce
> >>>>>>> savepoint, we should also check if the data files are used by
> >>>>>>> savepoints.", looks like we only save meta files for a savepoint.
> >>>>>>>
> >>>>>>> 3. How can users create a new table and restore data from the
> >>>>>>> specified savepoint?
> >>>>>>>
> >>>>>>> Best,
> >>>>>>> Shammon FY
> >>>>>>>
> >>>>>>>
> >>>>>>> On Wed, May 17, 2023 at 10:19 AM Jingsong Li <[email protected]>
> >>>>>>> wrote:
> >>>>>>>
> >>>>>>>> Thanks Zelin for driving.
> >>>>>>>>
> >>>>>>>> Some comments:
> >>>>>>>>
> >>>>>>>> 1. I think it's possible to advance `Proposed Changes` to the top;
> >>>>>>>> the Public API has no meaning if I don't know how to do it.
> >>>>>>>>
> >>>>>>>> 2. Public API: Savepoint and SavepointManager are not public API;
> >>>>>>>> only the Flink action or configuration option should be public API.
> >>>>>>>>
> >>>>>>>> 3. Maybe we can have a separate chapter to describe
> >>>>>>>> `savepoint.create-interval`, maybe 'Periodically savepoint'? It is
> >>>>>>>> not just an interval, because the true use case is a savepoint
> >>>>>>>> after 0:00.
> >>>>>>>>
> >>>>>>>> 4. About 'Interaction with Snapshot', to be continued ...
> >>>>>>>>
> >>>>>>>> Best,
> >>>>>>>> Jingsong
> >>>>>>>>
> >>>>>>>> On Tue, May 16, 2023 at 7:07 PM yu zelin <[email protected]> wrote:
> >>>>>>>>>
> >>>>>>>>> Hi, Paimon Devs,
> >>>>>>>>> I'd like to start a discussion about PIP-4 [1]. In this PIP, I
> >>>>>>>>> want to talk about why we need savepoint, and some thoughts about
> >>>>>>>>> managing and using savepoint. I look forward to your questions
> >>>>>>>>> and suggestions.
> >>>>>>>>>
> >>>>>>>>> Best,
> >>>>>>>>> Yu Zelin
> >>>>>>>>>
> >>>>>>>>> [1] https://cwiki.apache.org/confluence/x/NxE0Dw
> >>>>>>>>
> >>>>>>>
> >>>>>>
> >>>>
>
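For reference, the expiration rule in section 4 of Zelin's reply (a data file may be deleted only when no remaining snapshot or savepoint still references it) can be sketched roughly as below. This is only a minimal illustration in Python, not Paimon code: the `Ref` class, the `deletable_files` function, and the file names are all hypothetical, and the mechanism that ends up in the PIP may differ.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Ref:
    """Hypothetical stand-in for a snapshot or savepoint: a name plus the
    set of data files its metadata references."""
    name: str
    files: frozenset


def deletable_files(removed, remaining):
    """Return the files of `removed` that no remaining snapshot or
    savepoint still references, i.e. the files that are safe to delete."""
    still_used = set()
    for ref in remaining:
        still_used |= ref.files
    return set(removed.files) - still_used


# Assumed example state: two snapshots and one savepoint taken from snapshot-1.
snapshot_1 = Ref("snapshot-1", frozenset({"a.orc", "b.orc"}))
snapshot_2 = Ref("snapshot-2", frozenset({"b.orc", "c.orc"}))
savepoint_1 = Ref("savepoint-1", frozenset({"a.orc", "b.orc"}))

# Expiring snapshot-1 while savepoint-1 exists must not delete its files.
expired = deletable_files(snapshot_1, [snapshot_2, savepoint_1])

# Deleting savepoint-1 afterwards frees only the file nothing else uses.
freed = deletable_files(savepoint_1, [snapshot_2])
```

In this sketch, expiring snapshot-1 deletes nothing because savepoint-1 still references both of its files; deleting savepoint-1 later frees only the file that snapshot-2 does not use.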
