Hi,
I compared the snappy and zstd compressors for carbondata using TPCH.

For the TPCH lineitem table:

               carbon-zstd    carbon-snappy
loading (s)    53             51
size           795MB          1.2GB
TPCH-query:
Q1             4.289          8.29
Q2             12.609         12.986
Q3             14.902         14.458
Q4             6.276          5.954
Q5             23.147         21.946
Q6             1.12           0.945
Q7             23.017         28.007
Q8             14.554         15.077
Q9             28.472         27.473
Q10            24.067         24.682
Q11            3.321          3.79
Q12            5.311          5.185
Q13            14.08          11.84
Q14            2.262          2.087
Q15            5.496          4.772
Q16            29.919         29.833
Q17            7.018          7.057
Q18            17.367         17.795
Q19            2.931          2.865
Q20            11.347         10.937
Q21            26.416         28.414
Q22            5.923          6.311
sum            283.844        290.704

As you can see, with zstd the table size is about 33% smaller than with snappy (795MB vs 1.2GB), and the differences in data loading time (53 s vs 51 s) and total query time (283.844 vs 290.704) are negligible.

So I suggest changing the default compressor in carbondata from snappy to zstd.

To change the default compressor, we need to:

1. Append the compressor name to the carbondata file name, so that the user can tell from the file name which compressor was used. For example, the file name will change from
   part-0-0_batchno0-0-0-1580982686749.carbondata
   to
   part-0-0_batchno0-0-0-1580982686749.snappy.carbondata
   or
   part-0-0_batchno0-0-0-1580982686749.zstd.carbondata

2. Change the compressor constant in the CarbonCommonConstants.java file to use zstd as the default compressor.

What do you think?

Regards,
Jacky
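
P.S. To make point 1 concrete, here is a minimal sketch of the naming scheme. The class CompressorFileName and both of its methods are hypothetical names made up for illustration; they are not existing carbondata code. The sketch assumes that the part of the name before ".carbondata" never contains a dot in the current format, so a name without a dot is treated as an old-style file with no compressor recorded.

```java
import java.util.Optional;

/** Hypothetical helper, not part of carbondata: illustrates the proposed naming only. */
final class CompressorFileName {
    private static final String EXT = ".carbondata";

    private CompressorFileName() {}

    /** Inserts the compressor name just before the ".carbondata" extension. */
    static String withCompressor(String fileName, String compressor) {
        if (!fileName.endsWith(EXT)) {
            throw new IllegalArgumentException("not a carbondata file: " + fileName);
        }
        String base = fileName.substring(0, fileName.length() - EXT.length());
        return base + "." + compressor + EXT;
    }

    /** Reads the compressor name back; empty for old-style names without one. */
    static Optional<String> compressorOf(String fileName) {
        if (!fileName.endsWith(EXT)) {
            return Optional.empty();
        }
        String base = fileName.substring(0, fileName.length() - EXT.length());
        // Assumption: old-style base names contain no '.', so the last dot
        // (if any) separates the base name from the compressor name.
        int dot = base.lastIndexOf('.');
        return dot < 0 ? Optional.empty() : Optional.of(base.substring(dot + 1));
    }
}
```

With this sketch, withCompressor("part-0-0_batchno0-0-0-1580982686749.carbondata", "zstd") gives the zstd file name shown in point 1, and compressorOf on an old file name returns empty, which a reader could treat as "written before this change".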