Re: [Discussion]Do we still need to support carbon.merge.index.in.segment property ?

David CaiQiang Thu, 09 Jul 2020 19:53:33 -0700

Updated reply:

Merging the index files should be part of loading. It was not a good idea to
extract index merging into an independent process, because that caused the
query issue (the system can't find the index files during or after merging).


In my opinion, during loading, new .carbonindex files should be temporary.
We should merge them into a .carbonindexmerge file in the segment before
updating the segment status to success in the tablestatus file.
If merging the index fails, the load should fail.

For query:
1. Support reading both .carbonindex files and .carbonindexmerge files.
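
To make the query-side requirement concrete, here is a minimal sketch of
index discovery that accepts both file types. It is illustrative only: the
function name find_index_files and the flat segment directory layout are
assumptions, not CarbonData's real reader API, and it assumes no index entry
is present in both a merged file and a standalone file.

```python
import tempfile
from pathlib import Path


def find_index_files(segment_dir: Path) -> list[Path]:
    """Return every index file in a segment directory (hypothetical helper).

    Merged files come first, followed by any standalone .carbonindex files,
    for example those kept by an update operation.
    """
    merged = sorted(segment_dir.glob("*.carbonindexmerge"))
    # "*.carbonindex" must match the whole file name, so it does not
    # pick up "*.carbonindexmerge" files a second time.
    standalone = sorted(segment_dir.glob("*.carbonindex"))
    return merged + standalone


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        seg = Path(tmp)
        for name in ("0.carbonindexmerge", "update-1.carbonindex",
                     "part-0.carbondata"):
            (seg / name).touch()
        print([p.name for p in find_index_files(seg)])
```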

For loading (this also covers the loading part of compaction, create index,
create MV, and merge operations), it would be better to do it like this:
step 1. Update the tablestatus file to add an in-progress segment.
step 2. Generate carbondata files and temporary .carbonindex files.
step 3. Merge the .carbonindex files into a .carbonindexmerge file.
step 4. Write a segment file.
step 5. Update the tablestatus file with the final status, the segment file
name, and some statistics.
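
The five steps can be sketched roughly as follows. This is a toy model on a
local directory, not CarbonData code: the function load_segment, the JSON
layout of the tablestatus and segment files, the file naming, and the status
labels IN_PROGRESS, SUCCESS and FAILED are all assumptions made for
illustration. The point it shows is that the segment becomes SUCCESS only
after the temporary .carbonindex files have been merged and removed, and
that a failed merge fails the load.

```python
import json
import tempfile
from pathlib import Path


def load_segment(table_dir: Path, segment_id: str, blocks: dict[str, str]) -> None:
    """Hypothetical loader that follows the five proposed steps."""
    status_file = table_dir / "tablestatus"
    segment_dir = table_dir / f"Segment_{segment_id}"
    segment_dir.mkdir(parents=True)

    def write_status(status: str, **extra) -> None:
        # Read-modify-write of a simplified, JSON-based tablestatus file.
        entries = json.loads(status_file.read_text()) if status_file.exists() else {}
        entries[segment_id] = {"status": status, **extra}
        status_file.write_text(json.dumps(entries))

    # Step 1: add an in-progress segment to tablestatus.
    write_status("IN_PROGRESS")
    try:
        # Step 2: write data files and temporary per-block index files.
        for name, payload in blocks.items():
            (segment_dir / f"{name}.carbondata").write_text(payload)
            (segment_dir / f"{name}.carbonindex").write_text(f"index-of:{name}")

        # Step 3: merge the temporary index files, then delete them.
        index_files = sorted(segment_dir.glob("*.carbonindex"))
        merged = {f.name: f.read_text() for f in index_files}
        merge_file = segment_dir / f"{segment_id}.carbonindexmerge"
        merge_file.write_text(json.dumps(merged))
        for f in index_files:
            f.unlink()

        # Step 4: write the segment file that points at the merged index.
        segment_file = table_dir / f"{segment_id}.segment"
        segment_file.write_text(json.dumps({"index": merge_file.name}))
    except Exception:
        # If merging (or any earlier step) fails, the load fails too.
        write_status("FAILED")
        raise

    # Step 5: record final status, segment file name and some statistics.
    write_status("SUCCESS", segment_file=segment_file.name, data_files=len(blocks))


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        load_segment(Path(tmp), "0", {"part-0": "a", "part-1": "b"})
        print((Path(tmp) / "tablestatus").read_text())
```

Because queries only look at segments marked SUCCESS, a reader in this model
never sees a segment whose index files are still being merged.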

So in total, a load would:
 update the tablestatus file twice,
 write the segment file once,
 write the .carbonindexmerge file once,
 write and delete the .carbonindex files once.

For updating:
1. For now, only the update operation would keep .carbonindex files.
In the future, maybe we can change update operations to work the same way
as merge operations and generate new files into a new segment.



-----
Best Regards
David Cai
--
Sent from: 
http://apache-carbondata-dev-mailing-list-archive.1130556.n5.nabble.com/
