This is an automated email from the ASF dual-hosted git repository.

nic pushed a commit to branch document
in repository https://gitbox.apache.org/repos/asf/kylin.git
The following commit(s) were added to refs/heads/document by this push:
     new 3f61225  Add Quick Start EN version and fix format
3f61225 is described below

commit 3f61225c8ff10e1fc85a73c4cd2d79956594f7e1
Author: yaqian.zhang <598593...@qq.com>
AuthorDate: Tue Apr 14 15:33:32 2020 +0800

    Add Quick Start EN version and fix format
---
 website/_data/docs.yml                        |   1 +
 website/_docs/gettingstarted/quickstart.cn.md |  30 +--
 website/_docs/gettingstarted/quickstart.md    | 287 ++++++++++++++++++++++++++
 3 files changed, 303 insertions(+), 15 deletions(-)

diff --git a/website/_data/docs.yml b/website/_data/docs.yml
index 88c0a4d..69c642f 100644
--- a/website/_data/docs.yml
+++ b/website/_data/docs.yml
@@ -24,6 +24,7 @@
   - gettingstarted/faq
   - gettingstarted/events
   - gettingstarted/best_practices
+  - gettingstarted/kylin-quickstart

 - title: Installation
   docs:
diff --git a/website/_docs/gettingstarted/quickstart.cn.md b/website/_docs/gettingstarted/quickstart.cn.md
index d9d9421..6aa0853 100644
--- a/website/_docs/gettingstarted/quickstart.cn.md
+++ b/website/_docs/gettingstarted/quickstart.cn.md
@@ -62,7 +62,7 @@ apachekylin/apache-kylin-standalone:3.0.1
 并自动运行 $KYLIN_HOME/bin/sample.sh 及在 Kafka 中创建 kylin_streaming_topic topic 并持续向该 topic 中发送数据。这是为了让用户启动容器后,就能体验以批和流的方式构建 Cube 并进行查询。

 用户可以通过 docker exec 命令进入容器,容器内相关环境变量如下:
-```$xslt
+```
 JAVA_HOME=/home/admin/jdk1.8.0_141
 HADOOP_HOME=/home/admin/hadoop-2.7.0
 KAFKA_HOME=/home/admin/kafka_2.11-1.1.1
@@ -99,7 +99,7 @@ CentOS 6.5+ 或 Ubuntu 16.0.4+

 #### step1、下载kylin压缩包

-从https://kylin.apache.org/download/下载一个适用于你的Hadoop版本的二进制文件。目前最新Release版本是kylin 3.0.1和kylin 2.6.5,其中3.0版本支持实时摄入数据进行预计算的功能。以CDH 5.7的hadoop环境为例,可以使用如下命令行下载kylin 3.0.0:
+从[Apache Kylin Download Site](https://kylin.apache.org/download/)下载一个适用于你的Hadoop版本的二进制文件。目前最新Release版本是kylin 3.0.1和kylin 2.6.5,其中3.0版本支持实时摄入数据进行预计算的功能。以CDH 5.7的hadoop环境为例,可以使用如下命令行下载kylin 3.0.0:
 ```
 cd /usr/local/
 wget http://apache.website-solution.net/kylin/apache-kylin-3.0.0/apache-kylin-3.0.0-bin-cdh57.tar.gz
@@ -107,7 +107,7 @@ wget http://apache.website-solution.net/kylin/apache-kylin-3.0.0/apache-kylin-3.
 #### step2、解压kylin

 解压下载得到的kylin压缩包,并配置环境变量KYLIN_HOME指向解压目录:
-```$xslt
+```
 tar -zxvf apache-kylin-3.0.0-bin-cdh57.tar.gz
 cd apache-kylin-3.0.0-bin-cdh57
 export KYLIN_HOME=`pwd`
@@ -116,11 +116,11 @@ export KYLIN_HOME=`pwd`
 #### step3、下载SPARK

 由于kylin启动时会对SPARK环境进行检查,所以你需要设置SPARK_HOME指向自己的spark安装路径:
-```$xslt
+```
 export SPARK_HOME=/path/to/spark
 ```
 如果您没有已经下载好的Spark环境,也可以使用kylin自带脚本下载spark:
-```$xslt
+```
 $KYLIN_HOME/bin/download-spark.sh
 ```
 脚本会将解压好的spark放在$KYLIN_HOME目录下,如果系统中没有设置SPARK_HOME,启动kylin时会自动找到$KYLIN_HOME目录下的spark。
@@ -128,7 +128,7 @@ $KYLIN_HOME/bin/download-spark.sh
 #### step4、环境检查

 Kylin 运行在 Hadoop 集群上,对各个组件的版本、访问权限及 CLASSPATH 等都有一定的要求,为了避免遇到各种环境问题,您可以执行
-```$xslt
+```
 $KYLIN_HOME/bin/check-env.sh
 ```
 来进行环境检测,如果您的环境存在任何的问题,脚本将打印出详细报错信息。如果没有报错信息,代表您的环境适合 Kylin 运行。
@@ -136,11 +136,11 @@ $KYLIN_HOME/bin/check-env.sh
 #### step5、启动kylin

 运行如下命令来启动kylin:
-```$xslt
+```
 $KYLIN_HOME/bin/kylin.sh start
 ```
 如果启动成功,命令行的末尾会输出如下内容:
-```$xslt
+```
 A new Kylin instance is started by root. To stop it, run 'kylin.sh stop'
 Check the log at /usr/local/apache-kylin-3.0.0-bin-cdh57/logs/kylin.log
 Web UI is at http://<hostname>:7070/kylin
@@ -157,22 +157,22 @@
 Kylin提供了一个创建样例Cube的脚本,以供用户快速体验Kylin。
 在命令行运行:
-```$xslt
+```
 $KYLIN_HOME/bin/sample.sh
 ```
 完成后登陆kylin,点击System->Configuration->Reload Metadata来重载元数据

 元数据重载完成后你可以在左上角的Project中看到一个名为learn_kylin的项目,它包含kylin_sales_cube和kylin_streaming_cube,
 它们分别为batch cube和streaming cube,你可以直接对kylin_sales_cube进行构建,构建完成后就可以查询。
 对于kylin_streaming_cube,需要设置KAFKA_HOME指向你的kafka安装目录:
-```$xslt
+```
 export KAFKA_HOME=/path/to/kafka
 ```
 然后执行
-```$xslt
+```
 ${KYLIN_HOME}/bin/sample-streaming.sh
 ```
 该脚本会在 localhost:9092 broker 中创建名为 kylin_streaming_topic 的 Kafka Topic,它也会每秒随机发送 100 条 messages 到 kylin_streaming_topic,然后你可以对kylin_streaming_cube进行构建。

-关于sample cube,可以参考http://kylin.apache.org/cn/docs/tutorial/kylin_sample.html。
+关于sample cube,可以参考[Sample Cube](/cn/docs/tutorial/kylin_sample.html)。

 当然,你也可以根据下面的教程来尝试创建自己的Cube。
@@ -233,13 +233,13 @@
 点击Next跳转到下一页高级设置。在这里可以设置聚合组、RowKeys、Mandatory Cuboids、Cube Engine等。

-关于高级设置的详细信息,可以参考http://kylin.apache.org/cn/docs/tutorial/create_cube.html 页面中的步骤5,其中对聚合组等设置进行了详细介绍。
+关于高级设置的详细信息,可以参考[create_cube](/cn/docs/tutorial/create_cube.html) 页面中的步骤5,其中对聚合组等设置进行了详细介绍。

-关于更多维度优化,可以阅读http://kylin.apache.org/blog/2016/02/18/new-aggregation-group/。
+关于更多维度优化,可以阅读[aggregation-group](/blog/2016/02/18/new-aggregation-group/)。

 ![](/images/docs/quickstart/advance_setting.png)

-对于高级设置不是很熟悉时可以先保持默认设置,点击Next跳转到Kylin Properties页面,你可以在这里重写cube级别的kylin配置项,定义覆盖的属性,配置项请参考:http://kylin.apache.org/cn/docs/install/configuration.html。
+对于高级设置不是很熟悉时可以先保持默认设置,点击Next跳转到Kylin Properties页面,你可以在这里重写cube级别的kylin配置项,定义覆盖的属性,配置项请参考[配置项](/cn/docs/install/configuration.html)。

 ![](/images/docs/quickstart/properties.png)
diff --git a/website/_docs/gettingstarted/quickstart.md b/website/_docs/gettingstarted/quickstart.md
new file mode 100644
index
0000000..6a85427
--- /dev/null
+++ b/website/_docs/gettingstarted/quickstart.md
@@ -0,0 +1,287 @@

---
layout: docs
title: Quick Start
categories: start
permalink: /docs/gettingstarted/kylin-quickstart.html
since: v0.6.x
---

This guide walks novice Kylin users through the complete process, from download and installation to a sub-second query experience. It covers two scenarios: installation on an existing Hadoop environment, and installation from a Docker image when no Hadoop environment is available.

Following these steps, users can get an initial understanding of how to use Kylin, master its basic skills, and then design models and speed up queries for their own business scenarios.

### 01 Install Kylin From a Docker Image

To make it easy for users to try out Kylin, Zhu Weibin of Ant Financial has contributed a "Kylin Docker Image" to the community. The image comes with the services Kylin depends on already installed and deployed, including:

- JDK 1.8
- Hadoop 2.7.0
- Hive 1.2.1
- HBase 1.1.2
- Spark 2.3.1
- Zookeeper 3.4.6
- Kafka 1.1.1
- MySQL
- Maven 3.6.1

We have uploaded the user-facing Kylin image to the Docker repository. Users do not need to build the image locally; installing Docker is enough to experience Kylin's one-click installation.

#### Step1
First, execute the following command to pull the image from the Docker repository:
```
docker pull apachekylin/apache-kylin-standalone:3.0.1
```
The image here contains the latest version of Kylin, v3.0.1. Because it bundles all of the big data components Kylin depends on, pulling it takes a while – please be patient.
After the pull succeeds, the output looks like this:

![](/images/docs/quickstart/pull_docker.png)

#### Step2
Execute the following command to start the container:
```
docker run -d \
-m 8G \
-p 7070:7070 \
-p 8088:8088 \
-p 50070:50070 \
-p 8032:8032 \
-p 8042:8042 \
-p 16010:16010 \
apachekylin/apache-kylin-standalone:3.0.1
```
The container starts shortly afterwards. Since the ports inside the container are mapped to local ports, you can open the pages of each service directly in a local browser, for example:
- Kylin page: http://127.0.0.1:7070/kylin/
- HDFS NameNode page: http://127.0.0.1:50070
- YARN ResourceManager page: http://127.0.0.1:8088
- HBase page: http://127.0.0.1:16010

When the container starts, the following services are started automatically:
- NameNode, DataNode
- ResourceManager, NodeManager
- HBase
- Kafka
- Kylin

It also automatically runs $KYLIN_HOME/bin/sample.sh, creates a kylin_streaming_topic in Kafka and keeps sending data to that topic, so that users can experience building and querying cubes in both batch and streaming mode as soon as the container is launched.

Users can enter the container through the docker exec command. The relevant environment variables inside the container are:
- JAVA_HOME=/home/admin/jdk1.8.0_141
- HADOOP_HOME=/home/admin/hadoop-2.7.0
- KAFKA_HOME=/home/admin/kafka_2.11-1.1.1
- SPARK_HOME=/home/admin/spark-2.3.1-bin-hadoop2.6
- HBASE_HOME=/home/admin/hbase-1.1.2
- HIVE_HOME=/home/admin/apache-hive-1.2.1-bin
- KYLIN_HOME=/home/admin/apache-kylin-3.0.0-alpha2-bin-hbase1x

After logging in to Kylin with the user/password ADMIN/KYLIN, users can use the sample cube to experience building and querying a cube, or create and query their own models and cubes by following the tutorial from Step 8 of "Install and Use Kylin Based on a Hadoop Environment" below.
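Besides the web UI, a Kylin instance can also be reached over its REST API, which uses HTTP Basic authentication. As a minimal, hedged sketch (assuming the default ADMIN/KYLIN credentials and the mapped port 7070), the Authorization header can be built like this:

```shell
# Build the HTTP Basic auth header for the default ADMIN/KYLIN account.
# -n matters: a trailing newline must not be included in the encoding.
AUTH=$(echo -n "ADMIN:KYLIN" | base64)
echo "Authorization: Basic $AUTH"
# prints: Authorization: Basic QURNSU46S1lMSU4=

# With the container running, the header could then be used to log in, e.g.:
#   curl -X POST -H "Authorization: Basic $AUTH" \
#        http://127.0.0.1:7070/kylin/api/user/authentication
```

Remember to change the default password before exposing the instance beyond your own machine.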
### 02 Install and Use Kylin Based on a Hadoop Environment

Users who already have a stable Hadoop environment can download Kylin's binary package and deploy it on their Hadoop cluster. Before installation, check the environment against the following requirements.

#### Environmental Inspection

(1) Pre-conditions: Kylin relies on a Hadoop cluster to process large data sets. You need to prepare a Hadoop cluster configured with HDFS, YARN, MapReduce, Hive, HBase, Zookeeper and the other services Kylin needs to run.

Kylin can be started on any node of a Hadoop cluster. For convenience you can run Kylin on the master node, but for better stability we recommend deploying it on a clean Hadoop client node, where the Hive, HBase, HDFS and other command lines are installed and the client configurations (core-site.xml, hive-site.xml, hbase-site.xml and others) are properly set and automatically synchronized with the other nodes.

The Linux account running Kylin must have access to the Hadoop cluster, including the permissions to create/write HDFS folders, Hive tables and HBase tables and to submit MapReduce tasks.

(2) Hardware requirements: The server running Kylin should have at minimum a 4-core CPU, 16 GB of memory and 100 GB of disk.

(3) Operating system requirements: CentOS 6.5+ or Ubuntu 16.0.4+.

(4) Software requirements: Hadoop 2.7+, 3.0-3.1; Hive 0.13+, 1.2.1+; HBase 1.1+, 2.0 (supported since Kylin 2.5); JDK 1.8+.

It is recommended to install and test Kylin on an integrated Hadoop distribution, such as Hortonworks HDP or Cloudera CDH. Before each Kylin release, Hortonworks HDP 2.2-2.6 and 3.0, Cloudera CDH 5.7-5.11 and 6.0, AWS EMR 5.7-5.10, and Azure HDInsight 3.5-3.6 passed the tests.

#### Install and Use
When your environment meets the above prerequisites, you can install and start using Kylin.

#### Step1. Download the Kylin Archive
Download a binary package for your version of Hadoop from the [Apache Kylin Download Site](https://kylin.apache.org/download/). Currently the latest releases are Kylin 3.0.1 and Kylin 2.6.5; version 3.0 supports ingesting data in real time for pre-calculation. If your Hadoop environment is CDH 5.7, you can download Kylin 3.0.0 with the following commands:
```
cd /usr/local/
wget http://apache.website-solution.net/kylin/apache-kylin-3.0.0/apache-kylin-3.0.0-bin-cdh57.tar.gz
```

#### Step2. Extract Kylin
Extract the downloaded archive and set the environment variable KYLIN_HOME to point to the extracted directory:
```
tar -zxvf apache-kylin-3.0.0-bin-cdh57.tar.gz
cd apache-kylin-3.0.0-bin-cdh57
export KYLIN_HOME=`pwd`
```

#### Step3. Download Spark
Since Kylin checks the Spark environment when it starts, you need to set SPARK_HOME:
```
export SPARK_HOME=/path/to/spark
```
If you don't already have a Spark environment, you can download Spark with Kylin's own script:
```
$KYLIN_HOME/bin/download-spark.sh
```
The script places the decompressed Spark in the $KYLIN_HOME directory. If SPARK_HOME is not set in the system, the Spark under $KYLIN_HOME is found automatically when Kylin starts.

#### Step4. Environmental Inspection
Kylin runs on a Hadoop cluster and has certain requirements for the version, access permissions and CLASSPATH of each component. To avoid environmental problems, you can run the $KYLIN_HOME/bin/check-env.sh script to perform an environment check. The script prints detailed error messages if any problems are found; if there is no error message, your environment is suitable for running Kylin.

#### Step5. Start Kylin
Run the start script to start Kylin:
```
$KYLIN_HOME/bin/kylin.sh start
```
If the startup is successful, the following will be output at the end of the command line:
```
A new Kylin instance is started by root. To stop it, run 'kylin.sh stop'
Check the log at /usr/local/apache-kylin-3.0.0-bin-cdh57/logs/kylin.log
Web UI is at http://<hostname>:7070/kylin
```
The default Kylin port is 7070. You can run $KYLIN_HOME/bin/kylin-port-replace-util.sh set number to modify it; the modified port is 7070 + number.

#### Step6. Visit Kylin
After Kylin starts, you can access it in your browser at http://<hostname>:port/kylin, where <hostname> is the machine name, IP address or domain name and port is the Kylin port, 7070 by default.
The initial username and password are ADMIN/KYLIN. After the server starts, you can inspect the runtime log in $KYLIN_HOME/logs/kylin.log.

#### Step7. Create Sample Cube
Kylin provides a script that creates a sample cube so users can quickly experience Kylin. Run it from the command line:
```
$KYLIN_HOME/bin/sample.sh
```
After it completes, log in to Kylin and click System -> Configuration -> Reload Metadata to reload the metadata.

After the metadata is reloaded, you can see a project named learn_kylin in the Project list in the upper left corner.
It contains kylin_sales_cube and kylin_streaming_cube, a batch cube and a streaming cube respectively.
You can build kylin_sales_cube directly and query it once the build is complete.
For kylin_streaming_cube, you need to set KAFKA_HOME and then execute ${KYLIN_HOME}/bin/sample-streaming.sh.
This script creates a Kafka topic named kylin_streaming_topic on the localhost:9092 broker and sends 100 random messages per second to kylin_streaming_topic; you can then build kylin_streaming_cube.

For the sample cube, you can refer to: [Sample Cube](/docs/tutorial/kylin_sample.html)

Of course, you can also try to create your own cube by following the tutorial below.

#### Step8. Create Project
After logging in to Kylin, click the + in the upper left corner to create a Project.

![](/images/docs/quickstart/create_project.png)

#### Step9. Load Hive Table
Click Model -> Data Source -> Load Table From Tree.
Kylin reads the Hive source tables and displays them as a tree. Choose the tables you would like to add to your models and click Sync; the selected tables are then loaded into Kylin.

![](/images/docs/quickstart/load_hive_table.png)

They then appear in the Tables directory of the data source.

#### Step10. Create the Model
Click Model -> New -> New Model:

![](/images/docs/quickstart/create_model.png)

Enter the Model Name and click Next, then select the Fact Table and Lookup Tables. When adding a Lookup Table, you need to set its JOIN condition with the fact table.

![](/images/docs/quickstart/add_lookup_table.png)

Then click Next to select the dimensions:

![](/images/docs/quickstart/model_add_dimension.png)

Next, select the measures:

![](/images/docs/quickstart/model_add_measure.png)

The next step is to set the time partition column and filter conditions. The time partition column is used to select the time range during incremental builds; if no time partition column is set, the cubes under this model are always fully built. The filter condition becomes the WHERE condition when flattening the table.

![](/images/docs/quickstart/set_partition_column.png)

Then click Save to save the model.

#### Step11. Create Cube

Click Model -> New -> New Cube:

![](/images/docs/quickstart/create_cube.png)

Click Next to add dimensions. Dimensions from a Lookup Table can be set to Normal or Derived; the default is Derived. A derived dimension means the column can be derived from the primary key of the dimension table, so only the primary key column is actually calculated in the cube.
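The idea behind derived dimensions can be pictured as a key/value lookup. The toy sketch below is purely illustrative (SELLER_NAME, the IDs and the names are made-up values, not part of the sample schema): the cuboid stores only the dimension table's primary key, and the other columns are resolved from the lookup-table snapshot at query time.

```shell
# Toy illustration of a derived dimension: the cuboid materializes only
# the dimension table's primary key; other columns come from the lookup
# table snapshot at query time. Table/column names are illustrative.
lookup_seller_name() {          # stands in for the lookup-table snapshot
  case "$1" in
    10001) echo "Alice" ;;
    10002) echo "Bob" ;;
  esac
}
pk=10001                        # the only value stored in the cuboid row key
echo "derived SELLER_NAME = $(lookup_seller_name $pk)"
# prints: derived SELLER_NAME = Alice
```

This is why derived dimensions keep the cube small, at the cost of a lookup during query processing.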
![](/images/docs/quickstart/cube_add_dimension.png)

Click Next, then click + Measure to add a pre-calculated measure.

Kylin creates a COUNT(1) measure by default and supports eight aggregation types: SUM, MIN, MAX, COUNT, COUNT_DISTINCT, TOP_N, EXTENDED_COLUMN and PERCENTILE.

Please select an appropriate return type for COUNT_DISTINCT and TOP_N, as it affects the size of the cube.
Click OK when done and the measure will appear in the Measures list.

![](/images/docs/quickstart/cube_add_measure.png)

After adding all of the measures, click Next to proceed. This page holds the settings for cube data refresh.
Here you can set the thresholds for automatic merge (Auto Merge Thresholds), the minimum time for data retention (Retention Threshold) and the start time of the first segment.

![](/images/docs/quickstart/segment_auto_merge.png)

Click Next to continue to the Advanced Settings.
Here you can set the aggregation groups, RowKeys, Mandatory Cuboids, Cube Engine, etc.

For more information about the Advanced Settings, refer to Step 5 of [create_cube](/docs/tutorial/create_cube.html), which explains these options in detail.

For more on dimension optimization, you can read: [aggregation-group](/blog/2016/02/18/new-aggregation-group/).

![](/images/docs/quickstart/advance_setting.png)

If you are not familiar with the Advanced Settings, you can keep the defaults for now. Click Next to jump to the Kylin Properties page, where you can override cube-level Kylin configuration items.
For the available configuration items, please refer to: [configuration](/docs/install/configuration.html).

![](/images/docs/quickstart/properties.png)

After the configuration is complete, click Next to go to the last page.
Here you can preview the basic information of the cube you are creating, and you can return to the previous steps to modify it.
If you don't need to make any changes, click the Save button to complete the cube creation.
The new cube will then appear in your cube list.

![](/images/docs/quickstart/cube_list.png)

#### Step12. Build Cube

The cube created in the previous step only has definitions, no calculated data; its status is "DISABLED" and it cannot be queried yet. To give the cube data, you need to build it. Cubes are usually built in one of two ways: full builds or incremental builds.

Click Action in the Actions column of the cube and select Build.

If no time partition column is set in the model the cube belongs to, the build defaults to a full build; click Submit to submit the build task directly. If a time partition column is set, the following page appears, where you select the start and end time of the data to build:

![](/images/docs/quickstart/cube_build.png)

After setting the start and end time, click Submit to submit the build task.
You can then observe the status of the build task on the Monitor page.
Kylin shows the running status, output log and MapReduce tasks of each step on the page.
You can view more detailed log information in ${KYLIN_HOME}/logs/kylin.log.

![](/images/docs/quickstart/job_monitor.png)

After the job finishes, the status of the cube changes to READY and you can see the segment information.

![](/images/docs/quickstart/segment_info.png)

#### Step13. Query Cube
After the cube is built, you can see its table under the Tables list on the Insight page and query it there.
When a query hits the cube, it returns the pre-calculated results stored in HBase.

![](/images/docs/quickstart/query_cube.png)

Congratulations, you have now acquired the basic skills for using Kylin and can go on to discover and explore its more powerful functions.
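Queries from the Insight page go through Kylin's query REST endpoint, so they can also be scripted. A hedged sketch of assembling the JSON body for POST /kylin/api/query (the SQL and the learn_kylin project are illustrative sample-cube values; adjust them to your own model):

```shell
# Assemble the JSON body for Kylin's query REST endpoint.
# The SQL and project below are illustrative sample-cube values.
SQL="select part_dt, sum(price) as total_sold from kylin_sales group by part_dt"
PROJECT="learn_kylin"
PAYLOAD=$(printf '{"sql": "%s", "project": "%s"}' "$SQL" "$PROJECT")
echo "$PAYLOAD"

# Against a running instance the body could then be submitted, e.g.:
#   curl -X POST http://<hostname>:7070/kylin/api/query \
#        -H "Authorization: Basic $(echo -n ADMIN:KYLIN | base64)" \
#        -H "Content-Type: application/json" -d "$PAYLOAD"
```

Note the sketch does not escape quotes inside the SQL; for anything beyond simple statements, build the JSON with a proper tool.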