Re: Carbon merge should support update random columns each row

2020-09-10 Thread David CaiQiang
+1

Maybe we need to use a delta file to store the updated values instead of the
deletedelta file.



-
Best Regards
David Cai
--
Sent from: 
http://apache-carbondata-dev-mailing-list-archive.1130556.n5.nabble.com/


Clean files enhancement

2020-09-10 Thread vikramahuja1001
Hi all,
This mail is about enhancing the clean files command.
Current behaviour: When clean files is called, segments that are
MARKED_FOR_DELETE or COMPACTED are deleted, and their entries are removed
from the tablestatus file, the Fact folder and the metadata/segments folder.

Proposed behaviour: The idea is to create a trash folder (like a Recycle
Bin, with 777 permissions) under /tmp or under a user-defined location (a
new property will be exposed for this). Whenever a segment is cleaned, its
carbondata files (and no other files) are copied to this folder. The trash
folder can contain one folder per table, named like DBName_TableName. The
carbondata files are kept there for 3 days, or for as long as the user wants
(a carbon property will be exposed for this), and are deleted once they have
not been modified for that period. A background thread can check the age of
the files and delete the expired carbondata files from the trash folder.
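
To make the trash folder idea concrete, here is a rough sketch in Python
(not CarbonData code). The function names, the RETENTION_SECONDS constant
and the per-segment subfolder are assumptions made only for illustration;
the real property names and folder layout are still open.

```python
import shutil
import time
from pathlib import Path

# Assumed default: 3 days, as suggested in the proposal.
RETENTION_SECONDS = 3 * 24 * 60 * 60


def copy_segment_to_trash(segment_dir, trash_root, db_name, table_name):
    """Copy only the .carbondata files of a segment to
    <trash_root>/<DBName>_<TableName>/<segment folder name>/."""
    segment_dir = Path(segment_dir)
    target = Path(trash_root) / f"{db_name}_{table_name}" / segment_dir.name
    target.mkdir(parents=True, exist_ok=True)
    copied = []
    for src in sorted(segment_dir.glob("*.carbondata")):
        # shutil.copy does not preserve the modification time, so the copy's
        # mtime is the moment it entered the trash folder.
        shutil.copy(src, target / src.name)
        copied.append(target / src.name)
    return copied


def purge_expired(trash_root, retention_seconds=RETENTION_SECONDS, now=None):
    """Delete trashed .carbondata files not modified within the retention period."""
    now = time.time() if now is None else now
    removed = []
    # Materialise the listing first so deleting does not disturb the walk.
    for f in list(Path(trash_root).rglob("*.carbondata")):
        if now - f.stat().st_mtime > retention_seconds:
            f.unlink()
            removed.append(f)
    return removed
```

The aging thread mentioned above would then just call purge_expired
periodically with the configured retention value.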

Apart from that, clean files will also clean INSERT_IN_PROGRESS segments,
but it will first try to acquire the segment lock. If it can acquire the
lock, the segment is a stale folder and can be cleaned. If it cannot acquire
the lock, a load or some other operation is in progress, and the
INSERT_IN_PROGRESS segment will not be cleaned.
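
The lock check above could look roughly like the following Python sketch.
It uses an exclusive lock file as a stand-in for the CarbonData segment
lock; try_lock, release_lock and clean_if_stale are hypothetical names, and
the lock file is assumed to live outside the segment folder.

```python
import os
import shutil


def try_lock(lock_path):
    """Try to take an exclusive lock file without blocking.

    Returns a file descriptor on success, or None if the lock is held.
    """
    try:
        return os.open(lock_path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
    except FileExistsError:
        return None


def release_lock(fd, lock_path):
    """Release a lock taken with try_lock."""
    os.close(fd)
    os.remove(lock_path)


def clean_if_stale(segment_dir, lock_path):
    """Clean an INSERT_IN_PROGRESS segment only if its lock can be acquired.

    Returns True if the segment folder was cleaned, False if it was skipped.
    """
    fd = try_lock(lock_path)
    if fd is None:
        # Lock is held: a load or another operation is in progress.
        return False
    try:
        # Lock acquired: nobody is writing, so the folder is stale.
        shutil.rmtree(segment_dir, ignore_errors=True)
        return True
    finally:
        release_lock(fd, lock_path)
```

The important point is that the lock attempt must not block: if it fails,
clean files simply skips that segment and moves on.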

Please provide input and suggestions for this enhancement idea.

Thanks
Vikram Ahuja


