This is an automated email from the ASF dual-hosted git repository.

kunalkapoor pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/carbondata.git
The following commit(s) were added to refs/heads/master by this push:
     new c179195  [DOCUMENTATION] Document update for new configurations.
c179195 is described below

commit c179195f715132dc407347c828cd29ad4f697649
Author: manishnalla1994 <manish.nalla1...@gmail.com>
AuthorDate: Tue Jul 2 11:29:19 2019 +0530

    [DOCUMENTATION] Document update for new configurations.
    
    Added documentation for the new configurations.
    
    This closes #3314
---
 docs/configuration-parameters.md | 4 ++++
 docs/ddl-of-carbondata.md        | 2 +-
 docs/dml-of-carbondata.md        | 6 ++++++
 3 files changed, 11 insertions(+), 1 deletion(-)

diff --git a/docs/configuration-parameters.md b/docs/configuration-parameters.md
index 7b31413..808d507 100644
--- a/docs/configuration-parameters.md
+++ b/docs/configuration-parameters.md
@@ -48,6 +48,7 @@ This section provides the details of all the configurations required for the Car
 | carbon.invisible.segments.preserve.count | 200 | CarbonData maintains each data load entry in the tablestatus file. The entries in this file are not deleted for segments that are compacted or dropped, but are made invisible. If the number of data loads is very high, the size and number of entries in the tablestatus file can grow large, causing unnecessary reading of all data. This configuration specifies the number of segment entries to be maintained after they are compacted or dro [...]
 | carbon.lock.retries | 3 | CarbonData ensures consistency of operations by blocking certain operations from running in parallel. In order to block operations from running in parallel, a lock is obtained on the table. This configuration specifies the maximum number of retries to obtain the lock for any operation other than load. **NOTE:** Data manipulation operations like Compaction,UPDATE,DELETE or LOADING,UPDATE,DELETE are not allowed to run in parallel. However data loading can h [...]
 | carbon.lock.retry.timeout.sec | 5 | Specifies the interval between the retries to obtain the lock for any operation other than load. **NOTE:** Refer to ***carbon.lock.retries*** for understanding why CarbonData uses locks for operations. |
+| carbon.fs.custom.file.provider | None | Specifies a custom CarbonFile implementation (via FileTypeInterface) so that CarbonData can work with a custom FileSystem. |
 
 ## Data Loading Configuration
 
@@ -93,6 +94,8 @@ This section provides the details of all the configurations required for the Car
 | carbon.options.serialization.null.format | \N | Based on the business scenarios, some columns might need to be loaded with null values. As null values cannot be written in csv files, some special characters might be adopted to specify null values. This configuration can be used to specify the null value format in the data being loaded. |
 | carbon.column.compressor | snappy | CarbonData will compress the column values using the compressor specified by this configuration. Currently CarbonData supports 'snappy', 'zstd' and 'gzip' compressors. |
 | carbon.minmax.allowed.byte.count | 200 | CarbonData will write the min max values for string/varchar type columns using the byte count specified by this configuration. Max value is 1000 bytes (500 characters) and Min value is 10 bytes (5 characters). **NOTE:** This property is useful for reducing the store size, thereby improving the query performance, but can lead to query degradation if the value is not configured properly. |
+| carbon.merge.index.failure.throw.exception | true | Configures whether a failure to merge the index files should also fail the data load. |
+| carbon.binary.decoder | None | Configures the decoder used for binary columns during data loading. Two decoders are supported: base64 and hex. |
 
 ## Compaction Configuration
 
@@ -112,6 +115,7 @@ This section provides the details of all the configurations required for the Car
 | carbon.concurrent.compaction | true | Compaction of different tables can be executed concurrently. This configuration determines whether to compact all qualifying tables in parallel or not. **NOTE:** Compacting concurrently is a resource-demanding operation and needs more resources, thereby affecting the query performance also. This configuration is **deprecated** and might be removed in future releases. |
 | carbon.compaction.prefetch.enable | false | Compaction operation is similar to Query + data load, wherein data from qualifying segments is queried and data loading is performed to generate a new single segment. This configuration determines whether to query ahead data from segments and feed it for data loading. **NOTE:** This configuration is disabled by default as it needs extra resources for querying extra data. Based on the memory availability on the cluster, user can enable it to imp [...]
 | carbon.merge.index.in.segment | true | Each CarbonData file has a companion CarbonIndex file which maintains the metadata about the data. These CarbonIndex files are read and loaded into the driver and are used subsequently for pruning of data during queries. These CarbonIndex files are very small in size (few KB) and are many. Reading many small files from HDFS is not efficient and leads to slow IO performance. Hence these CarbonIndex files belonging to a segment can be combined into a sin [...]
+| carbon.enable.range.compaction | true | Configures whether range-based compaction is used for RANGE_COLUMN. If true, the data remains organized into ranges after compaction as well. |
 
 ## Query Configuration
 
diff --git a/docs/ddl-of-carbondata.md b/docs/ddl-of-carbondata.md
index 2495bf6..7ab0e5f 100644
--- a/docs/ddl-of-carbondata.md
+++ b/docs/ddl-of-carbondata.md
@@ -165,7 +165,7 @@ CarbonData DDL statements are documented here, which includes:
 
    | Properties | Default value | Description |
    | ---------- | ------------- | ----------- |
-   | carbon.local.dictionary.enable | false | By default, Local Dictionary will be disabled for the carbondata table. |
+   | carbon.local.dictionary.enable | true | By default, Local Dictionary will be enabled for the carbondata table. |
    | carbon.local.dictionary.decoder.fallback | true | Page Level data will not be maintained for the blocklet. During fallback, actual data will be retrieved from the encoded page data using the local dictionary. **NOTE:** Memory footprint decreases significantly as compared to when this property is set to false. |
 
    Local Dictionary can be configured using the following properties during create table command:
diff --git a/docs/dml-of-carbondata.md b/docs/dml-of-carbondata.md
index 3e2a22d..84c629c 100644
--- a/docs/dml-of-carbondata.md
+++ b/docs/dml-of-carbondata.md
@@ -70,6 +70,7 @@ CarbonData DML statements are documented here, which includes:
 | [IS_EMPTY_DATA_BAD_RECORD](#bad-records-handling) | Whether empty data of a column is to be considered as bad record or not |
 | [GLOBAL_SORT_PARTITIONS](#global_sort_partitions) | Number of partitions to use for shuffling of data during sorting |
 | [SCALE_FACTOR](#scale_factor) | Control the partition size for RANGE_COLUMN feature |
+| [CARBON_OPTIONS_BINARY_DECODER] | Configurable decoder for binary columns when loading from CSV |
 
 - You can use the following options to load data:
 
@@ -307,6 +308,11 @@ CarbonData DML statements are documented here, which includes:
   * If both GLOBAL_SORT_PARTITIONS and SCALE_FACTOR are used at the same time, only GLOBAL_SORT_PARTITIONS is valid.
   * The compaction on RANGE_COLUMN will use LOCAL_SORT by default.
+  - ##### CARBON_ENABLE_RANGE_COMPACTION
+
+    Configures whether range-based compaction is used for RANGE_COLUMN.
+    The default value is 'true'.
+
 
 ### INSERT DATA INTO CARBONDATA TABLE
 
 This command inserts data into a CarbonData table; it is defined as a combination of two queries, Insert and Select, respectively.
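The new load-time binary decoder documented in this patch could be exercised roughly as follows. This is only a sketch: the table name, HDFS path, and the lowercase load-option spelling `binary_decoder` are assumptions inferred from the `CARBON_OPTIONS_BINARY_DECODER` entry above, not taken from this commit.

```sql
-- Sketch only: table name, path, and the 'binary_decoder' option spelling
-- are assumptions based on the CARBON_OPTIONS_BINARY_DECODER entry above.
CREATE TABLE binary_demo (id INT, image BINARY) STORED AS carbondata;

-- Load a CSV whose binary column was exported as base64 text;
-- per the docs above, 'hex' is the other supported decoder.
LOAD DATA INPATH 'hdfs://localhost:9000/demo/binary_data.csv'
INTO TABLE binary_demo
OPTIONS('binary_decoder'='base64');
```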
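For the system-wide properties added in this patch, a `carbon.properties` fragment might look like the sketch below. The keys are the ones documented above; the uncommented values are the documented defaults, so this fragment only illustrates the syntax.

```properties
# Sketch of a carbon.properties fragment using the keys documented in this patch.
# Uncommented values are the documented defaults; carbon.binary.decoder has none.
carbon.merge.index.failure.throw.exception=true
carbon.enable.range.compaction=true
# carbon.binary.decoder=base64    # or: hex
# carbon.fs.custom.file.provider=<fully qualified class implementing FileTypeInterface>
```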