Repository: incubator-carbondata-site Updated Branches: refs/heads/asf-site b96a419a5 -> 2f826c1b5
http://git-wip-us.apache.org/repos/asf/incubator-carbondata-site/blob/2f826c1b/src/main/webapp/quick-start-guide.html ---------------------------------------------------------------------- diff --git a/src/main/webapp/quick-start-guide.html b/src/main/webapp/quick-start-guide.html index 0c58684..0246f64 100644 --- a/src/main/webapp/quick-start-guide.html +++ b/src/main/webapp/quick-start-guide.html @@ -156,21 +156,17 @@ <div class="row"> <div class="col-sm-12 col-md-12"> <div> - <h1> <a id="quick-start" class="anchor" href="#quick-start" aria-hidden="true"><span aria-hidden="true" class="octicon octicon-link"></span></a>Quick Start</h1> - <p>This tutorial provides a quick introduction to using CarbonData.</p> - <h2> <a id="prerequisites" class="anchor" href="#prerequisites" aria-hidden="true"><span aria-hidden="true" class="octicon octicon-link"></span></a>Prerequisites</h2> - <ul> <li> -<a href="https://github.com/apache/incubator-carbondata/blob/master/build" target=_blank>Installation and building CarbonData</a>.</li> +<p><a href="https://github.com/apache/incubator-carbondata/blob/master/build" target=_blank>Installation and building CarbonData</a>.</p> +</li> <li> <p>Create a sample.csv file using the following commands. The CSV file is required for loading data into CarbonData.</p> - <pre><code>cd carbondata cat > sample.csv << EOF id,name,city,age @@ -181,124 +177,82 @@ EOF </code></pre> </li> </ul> - <h2> <a id="interactive-analysis-with-spark-shell-version-21" class="anchor" href="#interactive-analysis-with-spark-shell-version-21" aria-hidden="true"><span aria-hidden="true" class="octicon octicon-link"></span></a>Interactive Analysis with Spark Shell Version 2.1</h2> - <p>Apache Spark Shell provides a simple way to learn the API, as well as a powerful tool to analyze data interactively. 
Please visit <a href="http://spark.apache.org/docs/latest/" target=_blank>Apache Spark Documentation</a> for more details on the Spark shell.</p>
- <h4> <a id="basics" class="anchor" href="#basics" aria-hidden="true"><span aria-hidden="true" class="octicon octicon-link"></span></a>Basics</h4>
- <p>Start the Spark shell by running the following command in the Spark directory:</p>
- <pre><code>./bin/spark-shell --jars <carbondata assembly jar path>
</code></pre>
- <p><strong>NOTE</strong>: The assembly jar will be available after <a href="https://github.com/apache/incubator-carbondata/blob/master/build/README.md" target=_blank>building CarbonData</a> and can be copied from <code>./assembly/target/scala-2.1x/carbondata_xxx.jar</code>.</p>
- <p>In this shell, SparkSession is readily available as <code>spark</code> and the Spark context is readily available as <code>sc</code>.</p>
- <p>To create a CarbonSession, we have to configure it explicitly in the following manner:</p>
- <ul> <li>Import the following:</li> </ul>
- <pre><code>import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.CarbonSession._
</code></pre>
- <ul> <li>Create a CarbonSession:</li> </ul>
- <pre><code>val carbon = SparkSession.builder().config(sc.getConf).getOrCreateCarbonSession("<hdfs store path>")
</code></pre>
- <p><strong>NOTE</strong>: By default the metastore location points to <code>../carbon.metastore</code>; the user can provide their own metastore location to CarbonSession, e.g. <code>SparkSession.builder().config(sc.getConf).getOrCreateCarbonSession("<hdfs store path>", "<local metastore path>")</code>.</p>
- <h4> <a id="executing-queries" class="anchor" href="#executing-queries" aria-hidden="true"><span aria-hidden="true" class="octicon octicon-link"></span></a>Executing Queries</h4>
- <h6> <a id="creating-a-table" class="anchor" href="#creating-a-table" aria-hidden="true"><span aria-hidden="true" class="octicon octicon-link"></span></a>Creating a Table</h6>
- <pre><code>scala>carbon.sql("CREATE TABLE IF NOT EXISTS test_table(id string, name string, city string, age Int) STORED BY 'carbondata'")
</code></pre>
- <h6> <a id="loading-data-to-a-table" class="anchor" href="#loading-data-to-a-table" aria-hidden="true"><span aria-hidden="true" class="octicon octicon-link"></span></a>Loading Data to a Table</h6>
- <pre><code>scala>carbon.sql("LOAD DATA INPATH 'sample.csv file path' INTO TABLE test_table")
</code></pre>
- <p><strong>NOTE</strong>: Please provide the real file path of <code>sample.csv</code> for the above script.</p>
- <h6> <a id="query-data-from-a-table" class="anchor" href="#query-data-from-a-table" aria-hidden="true"><span aria-hidden="true" class="octicon octicon-link"></span></a>Query Data from a Table</h6>
- <pre><code>scala>carbon.sql("SELECT * FROM test_table").show()
scala>carbon.sql("SELECT city, avg(age), sum(age) FROM test_table GROUP BY city").show()
</code></pre>
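<p>Putting the steps above together, a minimal end-to-end Spark 2.1 shell session might look as follows. This is a sketch: the assembly jar path, the HDFS store path, and the path to <code>sample.csv</code> are placeholder values to substitute for your environment.</p>
<pre><code>// Launch the shell first (placeholder jar path):
// ./bin/spark-shell --jars /path/to/carbondata-assembly.jar
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.CarbonSession._

// Assumed store path; replace with your own HDFS (or local) store path.
val carbon = SparkSession.builder().config(sc.getConf)
  .getOrCreateCarbonSession("hdfs://localhost:9000/carbon/store")

carbon.sql("CREATE TABLE IF NOT EXISTS test_table(id string, name string, city string, age Int) STORED BY 'carbondata'")
// Placeholder path: point this at the sample.csv created in the prerequisites.
carbon.sql("LOAD DATA INPATH '/path/to/sample.csv' INTO TABLE test_table")
carbon.sql("SELECT city, avg(age), sum(age) FROM test_table GROUP BY city").show()
</code></pre>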
- <h2> <a id="interactive-analysis-with-spark-shell-version-16" class="anchor" href="#interactive-analysis-with-spark-shell-version-16" aria-hidden="true"><span aria-hidden="true" class="octicon octicon-link"></span></a>Interactive Analysis with Spark Shell Version 1.6</h2>
- <h4> <a id="basics-1" class="anchor" href="#basics-1" aria-hidden="true"><span aria-hidden="true" class="octicon octicon-link"></span></a>Basics</h4>
- <p>Start the Spark shell by running the following command in the Spark directory:</p>
- <pre><code>./bin/spark-shell --jars <carbondata assembly jar path>
</code></pre>
- <p><strong>NOTE</strong>: The assembly jar will be available after <a href="https://github.com/apache/incubator-carbondata/blob/master/build/README.md" target=_blank>building CarbonData</a> and can be copied from <code>./assembly/target/scala-2.1x/carbondata_xxx.jar</code>.</p>
- <p><strong>NOTE</strong>: In this shell, SparkContext is readily available as <code>sc</code>.</p>
- <ul> <li>To execute queries, we need to import CarbonContext:</li> </ul>
- <pre><code>import org.apache.spark.sql.CarbonContext
</code></pre>
- <ul> <li>Create an instance of CarbonContext in the following manner:</li> </ul>
- <pre><code>val cc = new CarbonContext(sc, "<hdfs store path>")
</code></pre>
- <p><strong>NOTE</strong>: If running on a local machine without HDFS, configure the local machine's store path instead of the HDFS store path.</p>
- <h4> <a id="executing-queries-1" class="anchor" href="#executing-queries-1" aria-hidden="true"><span aria-hidden="true" class="octicon octicon-link"></span></a>Executing Queries</h4>
- <h6> <a id="creating-a-table-1" class="anchor" href="#creating-a-table-1" aria-hidden="true"><span aria-hidden="true" class="octicon octicon-link"></span></a>Creating a Table</h6>
- <pre><code>scala>cc.sql("CREATE TABLE IF NOT EXISTS test_table (id string, name string, city string, age Int) STORED BY 'carbondata'")
</code></pre>
- <p>To see the created table:</p>
- <pre><code>scala>cc.sql("SHOW TABLES").show()
</code></pre>
- <h6> <a id="loading-data-to-a-table-1" class="anchor" href="#loading-data-to-a-table-1" aria-hidden="true"><span aria-hidden="true" class="octicon octicon-link"></span></a>Loading Data to a Table</h6>
- <pre><code>scala>cc.sql("LOAD DATA INPATH 'sample.csv file path' INTO TABLE test_table")
</code></pre>
- <p><strong>NOTE</strong>: Please provide the real file path of <code>sample.csv</code> for the above script.</p>
- <h6> <a id="query-data-from-a-table-1" class="anchor" href="#query-data-from-a-table-1" aria-hidden="true"><span aria-hidden="true" class="octicon octicon-link"></span></a>Query Data from a Table</h6>
- <pre><code>scala>cc.sql("SELECT * FROM test_table").show()
scala>cc.sql("SELECT city, avg(age), sum(age) FROM test_table GROUP BY city").show()
</code></pre>
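<p>For comparison with the 2.1 flow above, the corresponding end-to-end session on Spark 1.6 goes through CarbonContext rather than CarbonSession. A minimal sketch, assuming a local store path and a placeholder path to <code>sample.csv</code>:</p>
<pre><code>import org.apache.spark.sql.CarbonContext

// Assumed local store path (no HDFS); replace with your own.
val cc = new CarbonContext(sc, "/tmp/carbon/store")

cc.sql("CREATE TABLE IF NOT EXISTS test_table (id string, name string, city string, age Int) STORED BY 'carbondata'")
cc.sql("SHOW TABLES").show()
// Placeholder path to the sample.csv created in the prerequisites.
cc.sql("LOAD DATA INPATH '/path/to/sample.csv' INTO TABLE test_table")
cc.sql("SELECT * FROM test_table").show()
</code></pre>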
http://git-wip-us.apache.org/repos/asf/incubator-carbondata-site/blob/2f826c1b/src/main/webapp/supported-data-types-in-carbondata.html
----------------------------------------------------------------------
diff --git a/src/main/webapp/supported-data-types-in-carbondata.html b/src/main/webapp/supported-data-types-in-carbondata.html
index 13e640f..b56bc59 100644
--- a/src/main/webapp/supported-data-types-in-carbondata.html
+++ b/src/main/webapp/supported-data-types-in-carbondata.html
@@ -156,17 +156,13 @@ <div class="row"> <div class="col-sm-12 col-md-12"> <div>
- <h1> <a id="data-types" class="anchor" href="#data-types" aria-hidden="true"><span aria-hidden="true" class="octicon octicon-link"></span></a>Data Types</h1>
- <h4> <a id="carbondata-supports-the-following-data-types" class="anchor" href="#carbondata-supports-the-following-data-types" aria-hidden="true"><span aria-hidden="true" class="octicon octicon-link"></span></a>CarbonData supports the following data types:</h4>
- <ul> <li> <p>Numeric Types</p>
- <ul> <li>SMALLINT</li> <li>INT/INTEGER</li>
@@ -177,7 +173,6 @@ </li> <li> <p>Date/Time Types</p>
- <ul> <li>TIMESTAMP</li> <li>DATE</li>
@@ -185,7 +180,6 @@ </li> <li> <p>String Types</p>
- <ul> <li>STRING</li> <li>CHAR</li>
@@ -193,7 +187,6 @@ </li> <li> <p>Complex Types</p>
- <ul> <li>arrays: ARRAY<code><data_type></code>
</li>
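<p>As an illustration, a table touching several of the listed types can be declared from the shell described in the quick start. The table and column names below are hypothetical, and the <code>carbon</code> CarbonSession is assumed to exist:</p>
<pre><code>scala>carbon.sql("CREATE TABLE IF NOT EXISTS type_demo (id INT, flag SMALLINT, name STRING, event_time TIMESTAMP, event_date DATE, tags ARRAY<STRING>) STORED BY 'carbondata'")
</code></pre>
<p>Here <code>id</code> and <code>flag</code> exercise the numeric types, <code>name</code> the string types, <code>event_time</code> and <code>event_date</code> the date/time types, and <code>tags</code> the complex ARRAY type.</p>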
http://git-wip-us.apache.org/repos/asf/incubator-carbondata-site/blob/2f826c1b/src/main/webapp/troubleshooting.html
----------------------------------------------------------------------
diff --git a/src/main/webapp/troubleshooting.html b/src/main/webapp/troubleshooting.html
index 96ac1f2..48f8871 100644
--- a/src/main/webapp/troubleshooting.html
+++ b/src/main/webapp/troubleshooting.html
@@ -156,250 +156,179 @@ <div class="row"> <div class="col-sm-12 col-md-12"> <div>
- <h1> <a id="troubleshooting" class="anchor" href="#troubleshooting" aria-hidden="true"><span aria-hidden="true" class="octicon octicon-link"></span></a>Troubleshooting</h1>
- <p>This tutorial is designed to provide troubleshooting for end users and developers who are building, deploying, and using CarbonData.</p>
- <h2> <a id="failed-to-load-thrift-libraries" class="anchor" href="#failed-to-load-thrift-libraries" aria-hidden="true"><span aria-hidden="true" class="octicon octicon-link"></span></a>Failed to load thrift libraries</h2>
- <p><strong>Symptom</strong></p>
- <p>Thrift throws the following exception:</p>
-
-<pre><code> thrift: error while loading shared libraries:
- libthriftc.so.0: cannot open shared object file: No such file or directory
+<pre><code>thrift: error while loading shared libraries:
+libthriftc.so.0: cannot open shared object file: No such file or directory
</code></pre>
- <p><strong>Possible Cause</strong></p>
- <p>The complete path to the directory containing the libraries is not configured correctly.</p>
- <p><strong>Procedure</strong></p>
- <p>Follow the Apache thrift docs at <a href="https://thrift.apache.org/docs/install" target=_blank>https://thrift.apache.org/docs/install</a> to install thrift correctly.</p>
- <h2> <a id="failed-to-launch-the-spark-shell" class="anchor" href="#failed-to-launch-the-spark-shell" aria-hidden="true"><span aria-hidden="true" class="octicon octicon-link"></span></a>Failed to launch the Spark Shell</h2>
- <p><strong>Symptom</strong></p>
- <p>The shell prompts the following error:</p>
-
-<pre><code> org.apache.spark.sql.CarbonContext$$anon$$apache$spark$sql$catalyst$analysis
- $OverrideCatalog$_setter_$org$apache$spark$sql$catalyst$analysis
- $OverrideCatalog$$overrides_$e
+<pre><code>org.apache.spark.sql.CarbonContext$$anon$$apache$spark$sql$catalyst$analysis
+$OverrideCatalog$_setter_$org$apache$spark$sql$catalyst$analysis
+$OverrideCatalog$$overrides_$e
</code></pre>
- <p><strong>Possible Cause</strong></p>
- <p>The Spark version and the selected Spark profile do not match.</p>
- <p><strong>Procedure</strong></p>
- <ol>
-<li><p>Ensure your spark version and selected profile for spark are correct.</p></li>
+<li>
+<p>Ensure your Spark version and the selected Spark profile are correct.</p>
+</li>
<li>
<p>Use the following command:</p>
-
-<pre><code> "mvn -Pspark-2.1 -Dspark.version {yourSparkVersion} clean package"
-</code></pre>
-
-<p>Note : Refrain from using "mvn clean package" without specifying the profile.</p>
+<pre><code>mvn -Pspark-2.1 -Dspark.version={yourSparkVersion} clean package
+</code></pre>
+<p>Note: Refrain from using "mvn clean package" without specifying the profile.</p>
</li>
</ol>
<h2> <a id="failed-to-execute-load-query-on-cluster" class="anchor" href="#failed-to-execute-load-query-on-cluster" aria-hidden="true"><span aria-hidden="true" class="octicon octicon-link"></span></a>Failed to execute load query on cluster.</h2>
- <p><strong>Symptom</strong></p>
- <p>Load query failed with the following exception:</p>
-
-<pre><code> Dictionary file is locked for updation.
+<pre><code>Dictionary file is locked for updation.
</code></pre>
- <p><strong>Possible Cause</strong></p>
- <p>The carbon.properties file is not identical in all the nodes of the cluster.</p>
- <p><strong>Procedure</strong></p>
- <p>Follow the steps to ensure the carbon.properties file is consistent across all the nodes:</p>
- <ol>
-<li><p>Copy the carbon.properties file from the master node to all the other nodes in the cluster.
- For example, you can use ssh to copy this file to all the nodes.</p></li>
-<li><p>For the changes to take effect, restart the Spark cluster.</p></li>
+<li>
+<p>Copy the carbon.properties file from the master node to all the other nodes in the cluster.
+For example, you can use scp to copy this file to all the nodes.</p>
+</li>
+<li>
+<p>For the changes to take effect, restart the Spark cluster.</p>
+</li>
</ol>
- <h2> <a id="failed-to-execute-insert-query-on-cluster" class="anchor" href="#failed-to-execute-insert-query-on-cluster" aria-hidden="true"><span aria-hidden="true" class="octicon octicon-link"></span></a>Failed to execute insert query on cluster.</h2>
- <p><strong>Symptom</strong></p>
- <p>Insert query failed with the following exception:</p>
-
-<pre><code> Dictionary file is locked for updation.
+<pre><code>Dictionary file is locked for updation.
</code></pre>
- <p><strong>Possible Cause</strong></p>
- <p>The carbon.properties file is not identical in all the nodes of the cluster.</p>
- <p><strong>Procedure</strong></p>
- <p>Follow the steps to ensure the carbon.properties file is consistent across all the nodes:</p>
- <ol>
-<li><p>Copy the carbon.properties file from the master node to all the other nodes in the cluster.
- For example, you can use scp to copy this file to all the nodes.</p></li>
-<li><p>For the changes to take effect, restart the Spark cluster.</p></li>
+<li>
+<p>Copy the carbon.properties file from the master node to all the other nodes in the cluster.
+For example, you can use scp to copy this file to all the nodes.</p>
+</li>
+<li>
+<p>For the changes to take effect, restart the Spark cluster.</p>
+</li>
</ol>
- <h2> <a id="failed-to-connect-to-hiveuser-with-thrift" class="anchor" href="#failed-to-connect-to-hiveuser-with-thrift" aria-hidden="true"><span aria-hidden="true" class="octicon octicon-link"></span></a>Failed to connect to hiveuser with thrift</h2>
- <p><strong>Symptom</strong></p>
- <p>We get the following exception:</p>
-
-<pre><code> Cannot connect to hiveuser.
+<pre><code>Cannot connect to hiveuser.
</code></pre>
- <p><strong>Possible Cause</strong></p>
- <p>The external process does not have permission to access the hiveuser.</p>
- <p><strong>Procedure</strong></p>
- <p>Ensure that the hiveuser in MySQL allows access from external processes.</p>
- <h2> <a id="failure-to-read-the-metastore-db-during-table-creation" class="anchor" href="#failure-to-read-the-metastore-db-during-table-creation" aria-hidden="true"><span aria-hidden="true" class="octicon octicon-link"></span></a>Failure to read the metastore db during table creation.</h2>
- <p><strong>Symptom</strong></p>
- <p>We get the following exception on trying to connect:</p>
-
-<pre><code> Cannot read the metastore db
+<pre><code>Cannot read the metastore db
</code></pre>
- <p><strong>Possible Cause</strong></p>
- <p>The metastore db is dysfunctional.</p>
- <p><strong>Procedure</strong></p>
- <p>Remove the metastore db from the carbon.metastore in the Spark directory.</p>
- <h2> <a id="failed-to-load-data-on-the-cluster" class="anchor" href="#failed-to-load-data-on-the-cluster" aria-hidden="true"><span aria-hidden="true" class="octicon octicon-link"></span></a>Failed to load data on the cluster</h2>
- <p><strong>Symptom</strong></p>
- <p>Data loading fails with the following exception:</p>
-
-<pre><code> Data Load failure exception
+<pre><code>Data Load failure exception
</code></pre>
- <p><strong>Possible Cause</strong></p>
- <p>The following issues can cause the failure:</p>
- <ol>
-<li><p>The core-site.xml, hive-site.xml, yarn-site.xml and carbon.properties are not consistent across all nodes of the cluster.</p></li>
+<li>
+<p>The core-site.xml, hive-site.xml, yarn-site.xml and carbon.properties are not consistent across all nodes of the cluster.</p>
+</li>
<li>
<p>The path to the hdfs ddl is not configured correctly in carbon.properties.</p>
+</li>
+</ol>
<p><strong>Procedure</strong></p>
- <p>Follow the steps to ensure the following configuration files are consistent across all the nodes:</p>
- <ol>
<li>
<p>Copy the core-site.xml, hive-site.xml, yarn-site.xml and carbon.properties files from the master node to all the other nodes in the cluster.
For example, you can use scp to copy these files to all the nodes.</p>
- <p>Note: Set the path to the hdfs ddl in carbon.properties on the master node.</p>
</li>
+<li>
+<p>For the changes to take effect, restart the Spark cluster.</p>
+</li>
</ol>
- <h2> <a id="failed-to-insert-data-on-the-cluster" class="anchor" href="#failed-to-insert-data-on-the-cluster" aria-hidden="true"><span aria-hidden="true" class="octicon octicon-link"></span></a>Failed to insert data on the cluster</h2>
- <p><strong>Symptom</strong></p>
- <p>Insertion fails with the following exception:</p>
-
-<pre><code> Data Load failure exception
+<pre><code>Data Load failure exception
</code></pre>
- <p><strong>Possible Cause</strong></p>
- <p>The following issues can cause the failure:</p>
- <ol>
-<li><p>The core-site.xml, hive-site.xml, yarn-site.xml and carbon.properties are not consistent across all nodes of the cluster.</p></li>
+<li>
+<p>The core-site.xml, hive-site.xml, yarn-site.xml and carbon.properties are not consistent across all nodes of the cluster.</p>
+</li>
<li>
<p>The path to the hdfs ddl is not configured correctly in carbon.properties.</p>
+</li>
+</ol>
<p><strong>Procedure</strong></p>
- <p>Follow the steps to ensure the following configuration files are consistent across all the nodes:</p>
- <ol>
<li>
<p>Copy the core-site.xml, hive-site.xml, yarn-site.xml and carbon.properties files from the master node to all the other nodes in the cluster.
For example, you can use scp to copy these files to all the nodes.</p>
- <p>Note: Set the path to the hdfs ddl in carbon.properties on the master node.</p>
</li>
+<li>
+<p>For the changes to take effect, restart the Spark cluster.</p>
+</li>
</ol>
- <h2> <a id="failed-to-execute-concurrent-operationsloadinsertupdate-on-table-by-multiple-workers" class="anchor" href="#failed-to-execute-concurrent-operationsloadinsertupdate-on-table-by-multiple-workers" aria-hidden="true"><span aria-hidden="true" class="octicon octicon-link"></span></a>Failed to execute concurrent operations (Load, Insert, Update) on a table by multiple workers.</h2>
- <p><strong>Symptom</strong></p>
- <p>Execution fails with the following exception:</p>
-
-<pre><code> Table is locked for updation.
+<pre><code>Table is locked for updation.
</code></pre>
- <p><strong>Possible Cause</strong></p>
- <p>Concurrent operations are not supported.</p>
- <p><strong>Procedure</strong></p>
- <p>A worker must wait for the running query execution to complete and the table to release its lock before another query execution can succeed.</p>
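<p>There is no API-level fix here, but client code can honor the lock by waiting and retrying. A minimal sketch from the shell, assuming the <code>carbon</code> session from the quick start; the helper name, retry count, sleep interval, and matched message fragment are illustrative assumptions, not part of the CarbonData API:</p>
<pre><code>// Hypothetical helper: re-submit a statement while the table lock is held.
def withRetry(statement: String, attempts: Int = 5, waitMs: Long = 10000): Unit = {
  var remaining = attempts
  var done = false
  while (!done && remaining > 0) {
    try {
      carbon.sql(statement)  // assumes the CarbonSession named `carbon`
      done = true
    } catch {
      // Assumed match on the lock message quoted above; adjust for your version.
      case e: Exception if e.getMessage != null && e.getMessage.contains("locked for updation") =>
        remaining -= 1
        Thread.sleep(waitMs)  // wait for the other worker to release the lock
    }
  }
  if (!done) throw new RuntimeException(s"Gave up after $attempts attempts: $statement")
}

withRetry("LOAD DATA INPATH '/path/to/sample.csv' INTO TABLE test_table")
</code></pre>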
- <h2> <a id="failed-to-create-a-table-with-a-single-numeric-column" class="anchor" href="#failed-to-create-a-table-with-a-single-numeric-column" aria-hidden="true"><span aria-hidden="true" class="octicon octicon-link"></span></a>Failed to create a table with a single numeric column.</h2>
- <p><strong>Symptom</strong></p>
- <p>Execution fails with the following exception:</p>
-
-<pre><code> Table creation fails.
+<pre><code>Table creation fails.
</code></pre>
- <p><strong>Possible Cause</strong></p>
- <p>Behaviour not supported.</p>
- <p><strong>Procedure</strong></p>
- <p>At least one column that can be considered a dimension is mandatory for table creation.</p>
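<p>As an illustration in the quick-start shell (the table names below are made up for the example), the first statement is expected to fail because its only column is numeric, while the second succeeds because the String column can act as a dimension:</p>
<pre><code>// Expected to fail: the single column is numeric, so no dimension exists.
scala>carbon.sql("CREATE TABLE only_measures (age Int) STORED BY 'carbondata'")

// Works: the String column 'name' can be treated as a dimension.
scala>carbon.sql("CREATE TABLE with_dimension (name String, age Int) STORED BY 'carbondata'")
</code></pre>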
</div>
</div>
http://git-wip-us.apache.org/repos/asf/incubator-carbondata-site/blob/2f826c1b/src/main/webapp/useful-tips-on-carbondata.html
----------------------------------------------------------------------
diff --git a/src/main/webapp/useful-tips-on-carbondata.html b/src/main/webapp/useful-tips-on-carbondata.html
index 977fe40..39e6b3c 100644
--- a/src/main/webapp/useful-tips-on-carbondata.html
+++ b/src/main/webapp/useful-tips-on-carbondata.html
@@ -156,29 +156,21 @@ <div class="row"> <div class="col-sm-12 col-md-12"> <div>
- <h1> <a id="useful-tips" class="anchor" href="#useful-tips" aria-hidden="true"><span aria-hidden="true" class="octicon octicon-link"></span></a>Useful Tips</h1>
- <p>This tutorial guides you to create CarbonData tables and optimize performance. The following sections elaborate on these topics:</p>
- <ul> <li><a href="#suggestions-to-create-carbondata-table">Suggestions to create CarbonData Table</a></li> <li><a href="#configurations-for-optimizing-carbondata-performance">Configurations For Optimizing CarbonData Performance</a></li> </ul>
- <h2> <a id="suggestions-to-create-carbondata-table" class="anchor" href="#suggestions-to-create-carbondata-table" aria-hidden="true"><span aria-hidden="true" class="octicon octicon-link"></span></a>Suggestions to Create CarbonData Table</h2>
- <p>Recently CarbonData was used to analyze performance in the telecommunication field. The results of the analysis for table creation with dimensions ranging from
-10 thousand to 10 billion rows and 100 to 300 columns have been summarized below. </p>
+10 thousand to 10 billion rows and 100 to 300 columns have been summarized below.</p>
<p>The following table describes some of the columns from the table used.</p>
- <p><strong>Table Column Description</strong></p>
- <table> <thead> <tr> @@ -233,18 +225,14 @@ The results of the analysis for table creation with dimensions ranging from </tr> </tbody> </table>
- <p>CarbonData has more than 50 test cases; on the basis of these, we have the following suggestions to enhance query performance:</p>
- <ul> <li> <p><strong>Put the frequently-used column filter in the beginning</strong></p>
-
-<p>For example, MSISDN filter is used in most of the query then we must put the MSISDN in the first column.
+<p>For example, if the MSISDN filter is used in most of the queries, then we must put MSISDN as the first column.
The create table command can be modified as suggested below:</p>
</li> </ul>
- <pre><code> create table carbondata_table(
msisdn String,
...
@@ -252,23 +240,18 @@ The create table command can be modified as suggested below :</p> TBLPROPERTIES ( 'DICTIONARY_EXCLUDE'='MSISDN,..', 'DICTIONARY_INCLUDE'='...');
</code></pre>
- <p>Now the query with MSISDN in the filter will be more efficient.</p>
- <ul> <li> <p><strong>Put the frequently-used columns in the order of low to high cardinality</strong></p>
- <p>If the table in the specified query has multiple columns which are frequently used to filter the results, it is suggested to put
-the columns in the order of cardinality low to high. This ordering of frequently used columns improves the compression ratio and
+the columns in the order of cardinality low to high.
This ordering of frequently used columns improves the compression ratio and
enhances the performance of queries with filter on these columns.</p>
-
-<p>For example if MSISDN, HOST and Dime_1 are frequently-used columns, then the column order of table is suggested as
-Dime_1>HOST>MSISDN as Dime_1 has the lowest cardinality.
+<p>For example, if MSISDN, HOST and Dime_1 are frequently-used columns, then the suggested column order of the table is
+Dime_1>HOST>MSISDN, as Dime_1 has the lowest cardinality.
The create table command can be modified as suggested below:</p>
</li> </ul>
- <pre><code> create table carbondata_table(
Dime_1 String,
HOST String,
@@ -278,16 +261,13 @@ The create table command can be modified as below :</p> TBLPROPERTIES ( 'DICTIONARY_EXCLUDE'='MSISDN,HOST..', 'DICTIONARY_INCLUDE'='Dime_1..');
</code></pre>
- <ul> <li> <p><strong>Put the Dimension type columns in order of low to high cardinality</strong></p>
- <p>If the columns used to filter are not frequently used, then it is suggested to order all the columns of dimension type in order of low to high cardinality.
The create table command can be modified as below:</p>
</li> </ul>
- <pre><code> create table carbondata_table(
Dime_1 String,
BEGIN_TIME bigint
@@ -298,16 +278,13 @@ The create table command can be modified as below :</p> TBLPROPERTIES ( 'DICTIONARY_EXCLUDE'='MSISDN,HOST,IMSI..', 'DICTIONARY_INCLUDE'='Dime_1,END_TIME,BEGIN_TIME..');
</code></pre>
- <ul> <li> <p><strong>For measure type columns with non-high accuracy, replace Numeric(20,0) data type with Double data type</strong></p>
-
-<p>For columns of measure type, not requiring high accuracy, it is suggested to replace Numeric data type with Double to enhance
+<p>For columns of measure type not requiring high accuracy, it is suggested to replace the Numeric data type with Double to enhance
query performance. The create table command can be modified as below:</p>
</li> </ul>
- <pre><code> create table carbondata_table(
Dime_1 String,
BEGIN_TIME bigint
@@ -321,20 +298,15 @@ query performance. The create table command can be modified as below :</p> TBLPROPERTIES ( 'DICTIONARY_EXCLUDE'='MSISDN,HOST,IMSI', 'DICTIONARY_INCLUDE'='Dime_1,END_TIME,BEGIN_TIME');
</code></pre>
- <p>The result of the performance analysis of this test case shows a reduction in query execution time from 15 to 3 seconds, thereby improving performance by nearly 5 times.</p>
- <ul> <li> <p><strong>Columns of incremental character should be re-arranged at the end of dimensions</strong></p>
- <p>Consider the following scenario: data is loaded each day and the start_time is incremental for each load; it is
-suggested to put start_time at the end of dimensions. </p>
-
+suggested to put start_time at the end of the dimensions.</p>
<p>Incremental values are efficient in using the min/max index. The create table command can be modified as below:</p>
</li> </ul>
- <pre><code> create table carbondata_table(
Dime_1 String,
HOST String,
@@ -348,26 +320,20 @@ suggested to put start_time at the end of dimensions. </p> TBLPROPERTIES ( 'DICTIONARY_EXCLUDE'='MSISDN,HOST,IMSI', 'DICTIONARY_INCLUDE'='Dime_1,END_TIME,BEGIN_TIME');
</code></pre>
- <ul> <li> <p><strong>Avoid adding high cardinality columns to dictionary</strong></p>
-
-<p>If the system has low memory configuration, then it is suggested to exclude high cardinality columns from the dictionary to
-enhance load performance. Creation of dictionary for high cardinality columns at time of load will degrade load performance due to
-excessive memory usage.
</p>
-
+<p>If the system has a low memory configuration, then it is suggested to exclude high cardinality columns from the dictionary to
+enhance load performance. Creating a dictionary for high cardinality columns at load time will degrade load performance due to
+excessive memory usage.</p>
<p>By default CarbonData determines the cardinality at the first data load and allows dictionary creation only if the cardinality is less than 1 million.</p>
</li> </ul>
- <h2> <a id="configurations-for-optimizing-carbondata-performance" class="anchor" href="#configurations-for-optimizing-carbondata-performance" aria-hidden="true"><span aria-hidden="true" class="octicon octicon-link"></span></a>Configurations for Optimizing CarbonData Performance</h2>
-
-<p>Recently we did some performance POC on CarbonData for Finance and telecommunication Field. It involved detailed queries and aggregation
+<p>Recently we did a performance POC on CarbonData for the finance and telecommunication fields. It involved detailed queries and aggregation
scenarios. After the completion of the POC, some of the configurations impacting performance were identified and are tabulated below:</p>
<table> <thead> <tr>