+1 I agree.
About non-dictionary columns in sort_columns:

1. Sort the column data in ColumnChunk2.

2. Compress the column by datatype:
   string: RLE, or snappy if RLE does not compress well
   short, int, bigint: delta and number compressor (ValueCompressor and NumberCompressor)
   float, double: delta and snappy (ValueCompressor and SnappyCompressor)

3. Store the column by datatype:
   string: byte[], using a null character as the separator
   short, int, bigint: byte[], using max/min to calculate a fixed length for storing the delta value
   float, double: byte[], uncompressed to float[] or double[]

4. Filter the column:
   column level: ExcludeFilterExecuterImpl, IncludeFilterExecuterImpl, RangeFilterExecuter

   At column level, RangeFilterExecuter should calculate the index range (start and end) within the sorted data chunk to get the bitset of the uncompressed result.

@Ravindra, please correct me.
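To make the integer case of points 3 and 4 concrete, here is a rough sketch in plain Java. It is not CarbonData code: the class SortedDeltaColumn and all of its methods are hypothetical names made up for illustration, and it assumes "delta" means the offset of each value from the chunk minimum. It stores a sorted chunk as fixed-width deltas sized from max/min, and answers a range filter by finding the start and end index with binary search and setting that span in a BitSet.

```java
import java.util.BitSet;

/** Illustrative sketch only; all names are hypothetical, not CarbonData APIs. */
final class SortedDeltaColumn {
    private final long min;          // smallest value in the chunk
    private final int bytesPerValue; // fixed width derived from (max - min)
    private final byte[] packed;     // (value - min), big-endian, fixed width
    private final int rowCount;

    private SortedDeltaColumn(long min, int bytesPerValue, byte[] packed, int rowCount) {
        this.min = min;
        this.bytesPerValue = bytesPerValue;
        this.packed = packed;
        this.rowCount = rowCount;
    }

    /** Encodes a chunk that is already sorted in ascending order. */
    static SortedDeltaColumn encode(long[] sorted) {
        if (sorted.length == 0) {
            return new SortedDeltaColumn(0L, 1, new byte[0], 0);
        }
        long min = sorted[0];
        long max = sorted[sorted.length - 1];
        // Bits needed for the largest delta, rounded up to whole bytes (at least 1).
        int bits = 64 - Long.numberOfLeadingZeros(max - min);
        int width = Math.max(1, (bits + 7) / 8);
        byte[] packed = new byte[sorted.length * width];
        for (int i = 0; i < sorted.length; i++) {
            long delta = sorted[i] - min;
            for (int b = width - 1; b >= 0; b--) {
                packed[i * width + b] = (byte) delta;
                delta >>>= 8;
            }
        }
        return new SortedDeltaColumn(min, width, packed, sorted.length);
    }

    int bytesPerValue() {
        return bytesPerValue;
    }

    /** Decodes the value stored at the given row. */
    long get(int row) {
        long delta = 0L;
        for (int b = 0; b < bytesPerValue; b++) {
            delta = (delta << 8) | (packed[row * bytesPerValue + b] & 0xFF);
        }
        return min + delta;
    }

    /** Index of the first row whose value is >= target. */
    private int lowerBound(long target) {
        int lo = 0;
        int hi = rowCount;
        while (lo < hi) {
            int mid = (lo + hi) >>> 1;
            if (get(mid) < target) {
                lo = mid + 1;
            } else {
                hi = mid;
            }
        }
        return lo;
    }

    /** Rows with low <= value <= high, returned as a bitset over row indexes. */
    BitSet rangeFilter(long low, long high) {
        BitSet rows = new BitSet(rowCount);
        if (low > high) {
            return rows;
        }
        int start = lowerBound(low);
        int end = (high == Long.MAX_VALUE) ? rowCount : lowerBound(high + 1);
        if (start < end) {
            rows.set(start, end); // end is exclusive
        }
        return rows;
    }
}
```

For example, encoding the sorted values 100, 103, 103, 110, 250 needs one byte per value (max - min = 150), and rangeFilter(103, 110) sets rows 1 to 3. Because the chunk is sorted, the filter only decodes the few rows probed by the binary search instead of scanning the whole chunk.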