Hi all,

PFA the design document. Please provide suggestions or feedback.

Vikram Ahuja

On Mon, Sep 28, 2020 at 12:23 PM Vikram Ahuja <vikramahuja8...@gmail.com> wrote:

> Thanks for the suggestion Ravi.
>
> We can include a property in the clean files command which can decide if
> we want to dry run.
> clean files on table t1 options('dry_run' = true) --> This will only show
> the segments which will be removed and will not clean/delete those segments
> or any data for that matter.
>
> By default, the dry_run will be set as false and the user can configure it
> when they want to use it.
>
> Rgds,
> Vikram
>
> On Mon, Sep 28, 2020 at 11:57 AM Akash r <akashr...@gmail.com> wrote:
>
>> +1 for ravi's comment. It's better, clean and safe.
>>
>> Regards,
>> Akash R Nilugal
>>
>> On Thu, Sep 24, 2020, 8:34 PM Ravindra Pesala <ravi.pes...@gmail.com>
>> wrote:
>>
>> > Hi Vikram,
>> >
>> > +1
>> >
>> > It is good to remove the automatic cleanup.
>> > But I am still worried about the clean file command executed by user as
>> > well. We need to enhance the clean file command to introduce dry run to
>> > print what segments it is going to be deleted and what is left. If user
>> > ok with dry run result then he can go for actual run.
>> >
>> > Regards,
>> > Ravindra.
>> >
>> > On Mon, 21 Sep 2020 at 1:27 PM, Vikram Ahuja <vikramahuja8...@gmail.com>
>> > wrote:
>> >
>> > > Hi Ravi and David,
>> > >
>> > > 1. All the automatic clean data in the case of load/insert/compact/delete
>> > > will be removed, so cleaning will only happen when the clean files
>> > > command is called.
>> > >
>> > > 2. We will only add the data to trash when we try to clean data which is
>> > > in IN PROGRESS state. In case of Compacted/Marked For Delete it will not
>> > > be moved to the trash, it will be directly deleted. The user will only be
>> > > able to recover the In Progress segments if the user wants. @Ravi -> Is
>> > > this okay for trash usage? Only using it for in progress segments.
>> > >
>> > > 3. No trash management will be implemented, the data will ONLY BE REMOVED
>> > > from the trash folder immediately when the clean files command is called.
>> > > There will be no time to live, the data can be kept in the trash folder
>> > > until the user triggers clean files command.
>> > >
>> > > Let me know if you have any questions.
>> > >
>> > > Vikram Ahuja
>> > >
>> > > On Fri, Sep 18, 2020 at 1:43 PM David CaiQiang <david.c...@gmail.com>
>> > > wrote:
>> > >
>> > > > agree with Ravindra,
>> > > >
>> > > > 1. stop all automatic clean data in load/insert/compact/update/delete...
>> > > >
>> > > > 2. when clean files command clean in-progress or uncertain data, we can
>> > > > move them to data trash.
>> > > > it can prevent delete useful data by mistake, we already find this
>> > > > issue in some scenes.
>> > > > other cases(for example clean mark_for_delete/compacted segment) should
>> > > > not use the data trash folder, clean data directly.
>> > > >
>> > > > 3. no need data trash management, suggest keeping it simple.
>> > > > The clean file command should support empty trash immediately, it will
>> > > > be enough.
>> > > >
>> > > > -----
>> > > > Best Regards
>> > > > David Cai
>> > > > --
>> > > > Sent from:
>> > > > http://apache-carbondata-dev-mailing-list-archive.1130556.n5.nabble.com/
>> > > >
>> > >
>> >
>> > --
>> > Thanks & Regards,
>> > Ravi
>> >
>>
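
For reference, the behaviour discussed in this thread could be sketched roughly as below. This is only an illustration in Python, not CarbonData code: `Status`, `Segment`, `plan_clean_files` and `clean_files` are hypothetical names, segments and the trash are modelled as plain lists, and it assumes that the trash is emptied before the in-progress segments of the current run are moved into it (the thread does not state the order).

```python
from dataclasses import dataclass
from enum import Enum


class Status(Enum):
    # Hypothetical status names mirroring the states mentioned in the thread.
    SUCCESS = "Success"
    IN_PROGRESS = "In Progress"
    COMPACTED = "Compacted"
    MARKED_FOR_DELETE = "Marked for Delete"


@dataclass(frozen=True)
class Segment:
    segment_id: str
    status: Status


def plan_clean_files(segments, trash):
    """List the (action, segment_id) pairs a clean files run would perform."""
    # Assumption: anything already in the trash is removed first.
    plan = [("empty_trash", segment_id) for segment_id in trash]
    for seg in segments:
        if seg.status is Status.IN_PROGRESS:
            # In-progress data goes to the trash so it can still be recovered.
            plan.append(("move_to_trash", seg.segment_id))
        elif seg.status in (Status.COMPACTED, Status.MARKED_FOR_DELETE):
            # Compacted / marked-for-delete data is deleted directly.
            plan.append(("delete", seg.segment_id))
    return plan


def clean_files(segments, trash, dry_run=False):
    """Return (plan, remaining_segments, new_trash).

    With dry_run=True only the plan is reported and nothing changes.
    """
    plan = plan_clean_files(segments, trash)
    if dry_run:
        return plan, list(segments), list(trash)
    removable = {Status.IN_PROGRESS, Status.COMPACTED, Status.MARKED_FOR_DELETE}
    remaining = [s for s in segments if s.status not in removable]
    new_trash = [s.segment_id for s in segments if s.status is Status.IN_PROGRESS]
    return plan, remaining, new_trash
```

In this sketch, calling `clean_files(segments, trash, dry_run=True)` corresponds to `options('dry_run' = true)`: the caller can review the plan and then repeat the call with the default `dry_run=False` to apply it.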