Hi devs,
I have updated PIP-18.
Please review it and let us discuss.

https://cwiki.apache.org/confluence/display/PAIMON/PIP-18%3A+Introduce+clone+Procedure
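
To make the discussion concrete, here is a minimal sketch of the two rules
quoted below (roll back and throw for a specified snapshot or tag, ignore
deleted files when cloning everything) together with the "at least one
complete snapshot" check. It is only a simulation on in-memory sets, not the
PIP-18 implementation: `clone_files`, `CloneError` and the file names are
hypothetical, and no Paimon API is used.

```python
from typing import Dict, Set


class CloneError(Exception):
    """Raised when the clone cannot produce a usable target table (hypothetical)."""


def clone_files(
    snapshots: Dict[str, Set[str]],  # snapshot name -> files it references
    source_files: Set[str],          # files that still exist in the source table
    ignore_missing: bool,            # True when cloning all snapshots and tags
) -> Set[str]:
    """Simulate copying every file of the selected snapshots; return the copied set."""
    copied: Set[str] = set()
    for name in sorted(snapshots):
        for path in sorted(snapshots[name]):
            if path in source_files:
                copied.add(path)
            elif not ignore_missing:
                # Specified snapshot or tag: roll back (drop copied files) and fail.
                copied.clear()
                raise CloneError(f"{path} of {name} was deleted during clone")
            # Otherwise the deleted file is skipped and the copy keeps going.

    # Post-check: at least one snapshot must be fully present in the target.
    if not any(files <= copied for files in snapshots.values()):
        raise CloneError("no complete snapshot in the target table")
    return copied
```

With snapshots {"snapshot-1": {"f1"}, "snapshot-2": {"f1", "f2"},
"snapshot-3": {"f2", "f3"}} and only "f2" left in the source, the call with
ignore_missing=True raises CloneError from the post-check, which is the
corner case described in the quoted mail.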

On Sun, Apr 7, 2024 at 4:53 PM Jingsong Li <[email protected]> wrote:
>
> > We must ensure that there is at least one complete snapshot in the target 
> > table after the clone procedure is finished.
>
> +1
>
> > We need to discuss whether it is possible for the following corner case to 
> > occur.
>
> This example is indeed a problem. We can check if there is a complete
> snapshot, but it can also be left out of consideration for now. It
> should be improved in the future.
>
> Best,
> Jingsong
>
> On Wed, Apr 3, 2024 at 4:40 PM wj wang <[email protected]> wrote:
> >
> > Hi Jingsong and Zelin,
> >
> > My opinion is as follows:
> > We must ensure that there is at least one complete snapshot in the target
> > table after the clone procedure is finished.
> >
> > > For cloning a specified snapshot or tag, we should undoubtedly roll back
> > > (delete the copied files) and throw an exception.
> >
> > My opinion is exactly the same as Jingsong's.
> >
> >
> >
> > > For cloning all snapshots and tags, we should ignore deleted files to
> > > keep the clone working and to avoid conflicting with snapshot expiration
> > > and file deletion in the streaming writing job.
> >
> > We need to discuss whether it is possible for the following corner case to
> > occur.
> >     1. At the beginning, the source table has three snapshots (snapshot-1,
> > snapshot-2, snapshot-3).
> >     2. We start a clone procedure. All files belonging to snapshot-1,
> > snapshot-2 and snapshot-3 are selected.
> >     3. A Flink batch job starts to copy the files.
> >     4. The streaming writing job commits snapshot-4, snapshot-5 and snapshot-6.
> >     5. snapshot-3 hits the snapshot expiration logic and some of its files
> > are deleted.
> >     6. The Flink batch job runs for a long time due to the cluster
> > environment and other factors. It eventually finishes, having ignored the
> > FileNotFound exceptions.
> >     7. In the end, there is no complete snapshot in the target table.
> > Is it possible for this corner case to occur? Let's discuss it.
> >
> >
> >
> > On Wed, Apr 3, 2024 at 3:13 PM Jingsong Li <[email protected]> wrote:
> >
> > > > I want to know that if in the clone procedure, the specified snapshot or
> > > > tag is being deleted, how do we handle the exception?
> > > > Should we stop the procedure and clean the temporary target table
> > > > directory?
> > >
> > > - For cloning a specified snapshot or tag, we should undoubtedly roll
> > > back (delete the copied files) and throw an exception.
> > >
> > > - For cloning all snapshots and tags, we should ignore deleted files
> > > to keep the clone working and to avoid conflicting with snapshot
> > > expiration and file deletion in the streaming writing job.
> > >
> > > Best,
> > > Jingsong
> > >
> > > On Wed, Apr 3, 2024 at 3:08 PM yu zelin <[email protected]> wrote:
> > > >
> > > > Hi Jingsong,
> > > >
> > > > I want to know: if the specified snapshot or tag is being deleted
> > > > during the clone procedure, how do we handle the exception?
> > > > Should we stop the procedure and clean up the temporary target table
> > > > directory?
> > > >
> > > > Best regards,
> > > > Zelin Yu
> > > >
> > > > On Mon, Mar 18, 2024 at 1:30 PM Jingsong Li <[email protected]> wrote:
> > > >
> > > > > Hi devs,
> > > > >
> > > > > I have heard many times that there is a need to copy the entire table,
> > > > > and my advice to them is often to use file system file copying.
> > > > >
> > > > > But there are a few issues:
> > > > > 1. It is necessary to copy a large number of files, and it is likely
> > > > > that some files will be deleted due to ongoing work, resulting in
> > > > > copying failure.
> > > > > 2. The target table may need to synchronize Hive metadata, which means
> > > > > using HiveCatalog, which cannot be solved by copying files.
> > > > >
> > > > > So I suggest we have a clone procedure. [1]
> > > > >
> > > > > Contributors are also welcome to develop this PIP together, and I
> > > > > will help review your code.
> > > > >
> > > > > [1] https://cwiki.apache.org/confluence/display/PAIMON/PIP-18%3A+Introduce+clone+Procedure
> > > > >
> > > > > Best,
> > > > > Jingsong
> > > > >
> > >
