Github user vandana7 commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/1534#discussion_r152549583
  
    --- Diff: docs/data-management-on-carbondata.md ---
    @@ -0,0 +1,713 @@
    +<!--
    +    Licensed to the Apache Software Foundation (ASF) under one or more 
    +    contributor license agreements.  See the NOTICE file distributed with
    +    this work for additional information regarding copyright ownership. 
    +    The ASF licenses this file to you under the Apache License, Version 2.0
    +    (the "License"); you may not use this file except in compliance with 
    +    the License.  You may obtain a copy of the License at
    +
    +      http://www.apache.org/licenses/LICENSE-2.0
    +
    +    Unless required by applicable law or agreed to in writing, software 
    +    distributed under the License is distributed on an "AS IS" BASIS, 
    +    WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    +    See the License for the specific language governing permissions and 
    +    limitations under the License.
    +-->
    +
    +# Data Management on CarbonData
    +
    +This tutorial is going to introduce all commands and data operations on CarbonData.
    +
    +* [CREATE TABLE](#create-table)
    +* [TABLE MANAGEMENT](#table-management)
    +* [LOAD DATA](#load-data)
    +* [UPDATE AND DELETE](#update-and-delete)
    +* [COMPACTION](#compaction)
    +* [PARTITION](#partition)
    +* [BUCKETING](#bucketing)
    +* [SEGMENT MANAGEMENT](#segment-management)
    +
    +## CREATE TABLE
    +
    +  This command can be used to create a CarbonData table by specifying the list of fields along with the table properties.
    +  
    +  ```
    +  CREATE TABLE [IF NOT EXISTS] [db_name.]table_name[(col_name data_type , ...)]
    +  STORED BY 'carbondata'
    +  [TBLPROPERTIES (property_name=property_value, ...)]
    +  ```  
    +  
    +### Usage Guidelines
    +
    +  Following are the guidelines for TBLPROPERTIES, CarbonData's additional table options can be set via carbon.properties.
    +  
    +   - **Dictionary Encoding Configuration**
    +
    +     Dictionary encoding is turned off for all columns by default from 1.3 onwards, you can use this command for including columns to do dictionary encoding.
    +     Suggested use cases : do dictionary encoding for low cardinality columns, it might help to improve data compression ratio and performance.
    +
    +     ```
    +     TBLPROPERTIES ('DICTIONARY_INCLUDE'='column1, column2')
    +     ```
    +     
    +   - **Inverted Index Configuration**
    +
    +     By default inverted index is enabled, it might help to improve compression ratio and query speed, especially for low cardinality columns which are in reward position.
    +     Suggested use cases : For high cardinality columns, you can disable the inverted index for improving the data loading performance.
    +
    +     ```
    +     TBLPROPERTIES ('NO_INVERTED_INDEX'='column1, column3')
    +     ```
    +
    +   - **Sort Columns Configuration**
    +
    +     This property is for users to specify which columns belong to the MDK(Multi-Dimensions-Key) index.
    +     * If users don't specify "SORT_COLUMN" property, by default MDK index be built by using all dimension columns except complex datatype column. 
    +     * If this property is specified but with empty argument, then the table will be loaded without sort..
    +     Suggested use cases : Only build MDK index for required columns,it might help to improve the data loading performance.
    +
    +     ```
    +     TBLPROPERTIES ('SORT_COLUMNS'='column1, column3')
    +     OR
    +     TBLPROPERTIES ('SORT_COLUMNS'='')
    +     ```
    +
    +   - **Sort Scope Configuration**
    +   
    +     This property is for users to specify the scope of the sort during data load, following are the types of sort scope.
    +     
    +     * LOCAL_SORT: It is the default sort scope.             
    +     * NO_SORT: It will load the data in unsorted manner, it will significantly increase load performance.       
    +     * BATCH_SORT: It increases the load performance but decreases the query performance if identified blocks > parallelism.
    +     * GLOBAL_SORT: It increases the query performance, especially high concurrent point query.
    +       And if you care about loading resources isolation strictly, because the system uses the spark GroupBy to sort data, the resource can be controlled by spark. 
    + 
    +   - **Table Block Size Configuration**
    +
    +     This command is for setting block size of this table, the default value is 1024 MB and supports a range of 1 MB to 2048 MB.
    +
    +     ```
    +     TBLPROPERTIES ('TABLE_BLOCKSIZE'='512')
    +     //512 or 512M both are accepted.
    +     ```
    +
    +### Example:
    +    ```
    +    CREATE TABLE IF NOT EXISTS productSchema.productSalesTable (
    +                                   productNumber Int,
    +                                   productName String,
    +                                   storeCity String,
    +                                   storeProvince String,
    +                                   productCategory String,
    +                                   productBatch String,
    +                                   saleQuantity Int,
    +                                   revenue Int)
    +    STORED BY 'carbondata'
    +    TBLPROPERTIES ('DICTIONARY_INCLUDE'='productNumber',
    +                   'NO_INVERTED_INDEX'='productBatch',
    +                   'SORT_COLUMNS'='productName,storeCity',
    +                   'SORT_SCOPE'='NO_SORT',
    +                   'TABLE_BLOCKSIZE'='512')
    +    ```
    +        
    +## TABLE MANAGEMENT  
    +
    +### SHOW TABLE
    +
    +  This command can be used to list all the tables in current database or all the tables of a specific database.
    +  ```
    +  SHOW TABLES [IN db_Name]
    +  ```
    +
    +  Example:
    +  ```
    +  SHOT TABLES
    --- End diff --
    
    SHOW TABLES

