+1

Good feature to add in CarbonData.

Regards,
Jacky

> On June 4, 2018, at 11:10 PM, Kumar Vishal <kumarvishal1...@gmail.com> wrote:
>
> Hi Community,
>
> Currently CarbonData supports either global dictionary or No-Dictionary
> (plain text stored in LV format) for storing dimension column data.
>
> *Bottlenecks with Global Dictionary*
>
> 1. The dictionary file is mutable, so global dictionary cannot be
>    supported in storage environments that do not support append.
> 2. It is difficult for the user to decide whether a column should be a
>    dictionary column when the table has a large number of columns.
> 3. Global dictionary generation generally slows down the load process.
>
> *Bottlenecks with No-Dictionary*
>
> 1. Storage size is high.
> 2. Queries on No-Dictionary columns are slower because more data is
>    read and processed.
> 3. Filtering is slower on No-Dictionary columns because the number of
>    comparisons is high.
> 4. Memory footprint is high.
>
> The above bottlenecks can be solved by *generating a local dictionary for
> low cardinality columns at each blocklet level*, which will help to
> achieve the following benefits:
>
> 1. It supports dictionary generation on different storage environments,
>    irrespective of the file operations (such as append) they support.
> 2. It reduces the extra read/write IO on the dictionary files generated
>    in the case of global dictionary.
> 3. It removes the need for the user to identify dictionary columns when
>    a table has many columns.
> 4. It gives better compression on dimension columns with low
>    cardinality.
> 5. Filter queries on No-Dictionary columns with local dictionary will be
>    faster, because filtering is done on encoded data.
> 6. It reduces store size and memory footprint, because only unique
>    values are stored in the local dictionary and the corresponding data
>    is stored in encoded form.
>
> Please provide your comments. Any suggestions from the community are
> most welcome. Please let me know if any clarification is needed.
>
> -Regards
> Kumar Vishal
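
For illustration only, the idea in the quoted proposal can be sketched roughly as below. This is a minimal sketch in Python, not CarbonData code: the names encode_blocklet, filter_equals and LOCAL_DICT_THRESHOLD are hypothetical, and the cardinality cutoff with fallback to plain values is an assumption, since the proposal only says "low cardinality columns" without giving a number.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical cutoff (assumption): the proposal does not state a value.
LOCAL_DICT_THRESHOLD = 1000


def encode_blocklet(
    values: List[str], threshold: int = LOCAL_DICT_THRESHOLD
) -> Tuple[Optional[Dict[str, int]], list]:
    """Build a dictionary local to one blocklet's column values.

    Returns (dictionary, codes). If the number of distinct values would
    exceed `threshold`, returns (None, plain values) as a No-Dictionary
    fallback (the fallback behaviour is an assumption of this sketch).
    """
    dictionary: Dict[str, int] = {}
    codes: List[int] = []
    for value in values:
        code = dictionary.get(value)
        if code is None:
            if len(dictionary) >= threshold:
                # Cardinality too high for this blocklet: keep plain values.
                return None, list(values)
            code = len(dictionary)  # codes assigned in first-seen order
            dictionary[value] = code
        codes.append(code)
    return dictionary, codes


def filter_equals(
    dictionary: Optional[Dict[str, int]], data: list, literal: str
) -> List[int]:
    """Return row indexes whose value equals `literal`."""
    if dictionary is None:
        # No local dictionary: compare the plain values directly.
        return [i for i, v in enumerate(data) if v == literal]
    code = dictionary.get(literal)
    if code is None:
        # Literal is absent from this blocklet, so no row can match.
        return []
    # Filter on encoded data: integer comparison instead of string comparison.
    return [i for i, c in enumerate(data) if c == code]
```

Because the dictionary lives with the blocklet, nothing has to be appended to a shared file, and an equality filter can rule out a whole blocklet with a single dictionary lookup when the literal is not present.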