Github user xuchuanyin commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/2520#discussion_r204969813

    --- Diff: docs/data-management-on-carbondata.md ---
    @@ -124,6 +124,41 @@ This tutorial is going to introduce all commands and data operations on CarbonDa
       TBLPROPERTIES ('streaming'='true')
       ```
    + - **Local Dictionary Configuration**
    +
    + Local Dictionary is generated only for no-dictionary string/varchar datatype columns. It helps in:
    + 1. Getting more compression on dimension columns with less cardinality.
    + 2. Filter queries and full scan queries on No-dictionary columns with local dictionary will be faster as filter will be done on encoded data.
    + 3. Reducing the store size and memory footprint as only unique values will be stored as part of local dictionary and corresponding data will be stored as encoded data.
    +
    + By default, Local Dictionary will be enabled and generated for all no-dictionary string/varchar datatype columns.
    +
    + Users will be able to pass following properties in create table command:
    +
    + | Properties | Default value | Description |
    + | ---------- | ------------- | ----------- |
    + | LOCAL_DICTIONARY_ENABLE | true | By default, local dictionary will be enabled for the table |
    + | LOCAL_DICTIONARY_THRESHOLD | 10000 | The maximum cardinality for local dictionary generation (range- 1000 to 100000) |
    + | LOCAL_DICTIONARY_INCLUDE | all no-dictionary string/varchar columns | Columns for which Local Dictionary is generated. |
    + | LOCAL_DICTIONARY_EXCLUDE | none | Columns for which Local Dictionary is not generated |
    +
    --- End diff --

    What about the limitations? Such as, can local dictionary columns work with:
    1. sort_columns?
    2. dictionary include?
    3. complex?
    4. etc.
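
For context on the question above, the properties listed in the table would be passed in a create table command roughly as follows. This is only a sketch: the table name `local_dict_example` and its columns are hypothetical, the `STORED BY 'carbondata'` clause is assumed from the CarbonData DDL of this period, and it says nothing about whether these properties can be combined with sort_columns, dictionary include, or complex types.

```sql
-- Hypothetical table for illustration. Only the four LOCAL_DICTIONARY_*
-- property names, their defaults, and the 1000 to 100000 threshold range
-- are taken from the diff quoted above.
CREATE TABLE IF NOT EXISTS local_dict_example (
  id INT,
  city STRING,
  note STRING
)
STORED BY 'carbondata'  -- assumed storage clause, not part of the diff
TBLPROPERTIES (
  'LOCAL_DICTIONARY_ENABLE'='true',      -- default is true
  'LOCAL_DICTIONARY_THRESHOLD'='5000',   -- must lie within 1000 to 100000
  'LOCAL_DICTIONARY_INCLUDE'='city',     -- generate a local dictionary for this column
  'LOCAL_DICTIONARY_EXCLUDE'='note'      -- skip local dictionary for this column
)
```

Listing one column under include and another under exclude keeps the two properties from overlapping; how an overlap or a non-string column is handled is exactly the kind of limitation the documentation would need to spell out.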
---