Re: Size control of minor compaction

2020-11-23 Thread Kunal Kapoor
Hi Zhangshunyu, We should refactor the code and change the property name from "carbon.major.compaction.size" to "carbon.compaction.size.threshold" (a global property that defines the size above which a segment is not considered for auto compaction). By doing this we can use the sam

Re: Size control of minor compaction

2020-11-23 Thread Zhangshunyu
OK - My English name is Sunday -- Sent from: http://apache-carbondata-dev-mailing-list-archive.1130556.n5.nabble.com/

Re: Size control of minor compaction

2020-11-23 Thread Ajantha Bhat
Hi Zhangshunyu, Thanks for providing more details on the problem. If it is just for skipping history segments during auto minor compaction, adding a size threshold for minor compaction should be fine. We can have a table-level, dynamically configurable threshold. If it is not configured, consider

Re: Size control of minor compaction

2020-11-23 Thread Zhangshunyu
Agree.

Re: Size control of minor compaction

2020-11-23 Thread Zhangshunyu
Yes, we need to support auto load merge for major compaction or a size threshold limit for minor compaction. In many cases, users who use minor compaction only want to merge small segments by time series (the number of segments is generated in time series); they don't want to merge a big segment which is

Re: Size control of minor compaction

2020-11-23 Thread Zhangshunyu
Hi Ajantha, thanks for the reply. Many users enable auto load merge for minor compaction because a segment is generated every hour. Sometimes the user loads some history data manually with the load command, and the segment for that history data is very large, but

Re: Size control of minor compaction

2020-11-23 Thread akashrn5
Hi Sunday, this looks like a valid scenario, because some user applications may run minor compaction by default and some may have auto compaction enabled, which is basically minor compaction, and even if the size is large we blindly go ahead and compact. So I think instead of supporting auto compaction

Re: [DISCUSSION]Merge index property and operations improvement.

2020-11-23 Thread akashrn5
Hi David, thanks for the reply. a) Remove mergeIndex property and event listener, add mergeIndex as a part of loading/compaction transaction. ==> Yes, this can be done, as already discussed. b) If the merging index failed, loading/compaction should fail directly. ==> Agree to this, same as replied t

Re: Size control of minor compaction

2020-11-23 Thread Ajantha Bhat
Hi Zhangshunyu, For this scenario-specific case, the user can use custom compaction by specifying the segment IDs that need to be considered for compaction. Also, if you just want size-based compaction, major compaction can be used. So why are you thinking of supporting size-based minor compaction? It

Re: Size control of minor compaction

2020-11-23 Thread David CaiQiang
+1 It will take many resources and a long time to compact a large segment, and may not give a good result. Auto compaction is disabled; we could give a large default value (maybe 1024 GB), so it will not impact the default behavior. A table-level threshold is also needed. If the user wants t
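
The size-threshold idea discussed in this thread can be sketched roughly as follows. This is only an illustration under stated assumptions, not CarbonData code: the `Segment` class, both function names, the use of megabytes as the unit, and the precedence order (table-level value, then global value, then a large default) are all hypothetical. The 1024 GB default mirrors the value floated above.

```python
from dataclasses import dataclass
from typing import List, Optional

# Assumed default of 1024 GB, expressed in MB, so that behavior is
# effectively unchanged when nothing is configured (hypothetical unit).
DEFAULT_THRESHOLD_MB = 1024 * 1024


@dataclass
class Segment:
    """Hypothetical stand-in for a table segment."""
    segment_id: str
    size_mb: int


def effective_threshold_mb(table_threshold_mb: Optional[int],
                           global_threshold_mb: Optional[int] = None) -> int:
    """Pick the threshold: table-level first, then global, then the default.

    The precedence order is an assumption made for this sketch.
    """
    if table_threshold_mb is not None:
        return table_threshold_mb
    if global_threshold_mb is not None:
        return global_threshold_mb
    return DEFAULT_THRESHOLD_MB


def minor_compaction_candidates(segments: List[Segment],
                                threshold_mb: int) -> List[Segment]:
    """Keep segments at or below the threshold; larger ones are skipped."""
    return [s for s in segments if s.size_mb <= threshold_mb]


# Hourly segments are small; a manually loaded history segment is large.
segments = [Segment("0", 200), Segment("1", 50_000), Segment("2", 150)]
picked = minor_compaction_candidates(segments, effective_threshold_mb(1024))
print([s.segment_id for s in picked])
```

With a table-level threshold of 1024 MB, the large history segment is left out of the candidate list, while the two small hourly segments remain eligible.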

Re: [ANNOUNCE] Ajantha as new PMC for Apache CarbonData

2020-11-23 Thread David CaiQiang
Congratulations to Ajantha. - Best Regards David Cai

Re: [DISCUSSION]Merge index property and operations improvement.

2020-11-23 Thread David CaiQiang
a) Remove mergeIndex property and event listener, add mergeIndex as a part of loading/compaction transaction. b) If the merging index failed, loading/compaction should fail directly. c) Keep merge_index command and mark it deprecated. For a new table, maybe it will do nothing.

Re: [DISCUSSION]Merge index property and operations improvement.

2020-11-23 Thread akashrn5
Hi Ajantha, thanks for the reply. Please find my comments. *a) and b)* Agreed: there is no need to mark the load as successful if the merge index fails; we can fail the load, and update the status and segment file only after merge index, to avoid many reliability, concurrency, and cache issues. *c)