Enhancement on compaction performance

2018-11-07 Thread xuchuanyin
Hi all:
I am raising a PR to enhance the performance of compaction. The PR number is 
#2906.

Based on my experiments using about 72GB of LineItem data (from the 100GB TPC-H
dataset), I got the following results.

Code Branch  Prefetch  Batch Size (default 100)  Load1 (s)  Load2 (s)  Load3 (s)  Compact 3 Loads (s)  Time Reduced
master       NA        100                       447.4      445.9      450.1      661.3                Baseline
master       NA        32000                     441.5      454.4      456.8      641.2                +3.0%
PR2906       enable    100                       445.3      450.2      445.3      411.8                +37.7%
PR2906       enable    32000                     438.7      446.8      441.8      333.1                +49.6%
PR2906       disable   100                       458.1      459.4      450.9      659.5                +0.3%
PR2906       disable   32000                     472.0      446.8      457.1      654.5                +1.0%
Note: These tests were run on Spark 2.2.

The results show that compaction performance almost doubles when configured
properly (prefetch enabled with a batch size of 32000).
They also show that compaction performance does not degrade when this feature
is disabled.
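
For anyone who wants to check the "Time Reduced" column, here is a minimal
sketch of the arithmetic. It assumes the master run with batch size 100
(661.3 s) is the baseline, and the function names are only illustrative,
they are not part of the PR.

```python
# Compaction times in seconds, copied from the table above.
BASELINE = 661.3  # master branch, batch size 100

# Keys are (branch, prefetch, batch size); values are "Compact 3 Loads (s)".
runs = {
    ("master", "NA", 32000): 641.2,
    ("PR2906", "enable", 100): 411.8,
    ("PR2906", "enable", 32000): 333.1,
    ("PR2906", "disable", 100): 659.5,
    ("PR2906", "disable", 32000): 654.5,
}


def time_reduced(seconds, baseline=BASELINE):
    """Percentage of compaction time saved relative to the baseline run."""
    return (baseline - seconds) / baseline * 100


def speedup(seconds, baseline=BASELINE):
    """How many times faster a run is than the baseline run."""
    return baseline / seconds


for (branch, prefetch, batch), seconds in runs.items():
    print(f"{branch:7} {prefetch:8} {batch:6} "
          f"{time_reduced(seconds):+5.1f}%  {speedup(seconds):.2f}x")
```

The best case (prefetch enabled, batch size 32000) comes out at roughly a
1.99x speedup, which is where "almost doubles" comes from.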

So I propose:

1. I would like to make this feature ‘enabled’ by default.

2. I would also like others in the community to test this feature and
check whether they benefit from it.

Any feedback is welcome.



Re: Change the 'comment' content for column when execute command 'desc formatted table_name'

2018-11-07 Thread Jacky Li
The example was missing from my last mail. I have now put it in
CARBONDATA-3087;
please go to the JIRA and reply there if you have any comments.

Regards,
Jacky



--
Sent from: 
http://apache-carbondata-dev-mailing-list-archive.1130556.n5.nabble.com/


Re: Change the 'comment' content for column when execute command 'desc formatted table_name'

2018-11-07 Thread Jacky Li
Hi,

I revisited this discussion and suggest changing the DESC FORMATTED
output to the following:



The information is outlined in 6 sections:
1. Table basic information
2. Index information
3. Encoding information
4. Compaction information
5. Partition information (only for partition table)
6. Dynamic information

Please check whether it contains the information you need. I
will create a JIRA and PR soon.

Regards,
Jacky


