Author: lidong
Date: Sat Jul 22 01:40:55 2017
New Revision: 1802649
URL: http://svn.apache.org/viewvc?rev=1802649&view=rev
Log:
Add kaisen's blog
Added:
kylin/site/2017/
kylin/site/2017/07/
kylin/site/2017/07/21/
kylin/site/2017/07/21/Improving-Spark-Cubing/
kylin/site/2017/07/21/Improving-Spark-Cubing/index.html
Modified:
kylin/site/blog/index.html
kylin/site/docs20/tutorial/cube_spark.html
kylin/site/feed.xml
Added: kylin/site/2017/07/21/Improving-Spark-Cubing/index.html
URL:
http://svn.apache.org/viewvc/kylin/site/2017/07/21/Improving-Spark-Cubing/index.html?rev=1802649&view=auto
==============================================================================
--- kylin/site/2017/07/21/Improving-Spark-Cubing/index.html (added)
+++ kylin/site/2017/07/21/Improving-Spark-Cubing/index.html Sat Jul 22 01:40:55
2017
@@ -0,0 +1,151 @@
+<h1 id="improving-spark-cubing-in-kylin-20">Improving Spark Cubing in Kylin
2.0</h1>
+
+<p>Author: Kaisen Kang</p>
+
+<hr />
+
+<p>Apache Kylin is an OLAP engine that speeds up queries through Cube
precomputation. A Cube is a multi-dimensional dataset that contains all
measures precomputed for every combination of dimensions. Before v2.0, Kylin
used MapReduce to build Cubes. To get better performance, Kylin 2.0 introduced
Spark Cubing. For the principles behind Spark Cubing, please refer to the
article <a
href="http://kylin.apache.org/blog/2017/02/23/by-layer-spark-cubing/">By-layer
Spark Cubing</a>.</p>
+
+<p>In this blog, I will talk about the following topics:</p>
+
+<ul>
+ <li>How to make Spark Cubing support HBase cluster with Kerberos enabled</li>
+ <li>Spark configurations for Cubing</li>
+ <li>Performance of Spark Cubing</li>
+ <li>Pros and cons of Spark Cubing</li>
+ <li>Applicable scenarios of Spark Cubing</li>
+ <li>Improvement for dictionary loading in Spark Cubing</li>
+</ul>
+
+<p>The current Spark Cubing version (2.0) doesn't support HBase clusters
that use Kerberos, because Spark Cubing needs to get metadata from HBase. There
are two ways to solve this problem: one is to make Spark able to connect to
HBase with Kerberos, the other is to avoid connecting to HBase from Spark
Cubing at all.</p>
+
+<h3 id="make-spark-connect-hbase-with-kerberos-enabled">Make Spark connect
to HBase with Kerberos enabled</h3>
+<p>If we just want to run Spark Cubing in YARN client mode, we only need to
add three lines of code before new SparkConf() in SparkCubingByLayer:</p>
+
+<div class="highlighter-rouge"><pre class="highlight"><code> Configuration configuration = HBaseConnection.getCurrentHBaseConfiguration();
+ HConnection connection = HConnectionManager.createConnection(configuration);
+ // Obtain an authentication token for the given user and add it to the user's credentials.
+ TokenUtil.obtainAndCacheToken(connection, UserProvider.instantiate(configuration).create(UserGroupInformation.getCurrentUser()));
+</code></pre>
+</div>
+
+<p>As for how to make Spark connect to HBase using Kerberos in YARN cluster
mode, please refer to SPARK-6918, SPARK-12279, and HBASE-17040. That solution
may work, but it is not elegant, so I tried the second solution.</p>
+
+<h3 id="use-hdfs-metastore-for-spark-cubing">Use HDFS metastore for Spark
Cubing</h3>
+
+<p>The core idea here is to upload the metadata that the job needs to HDFS
and to use HDFSResourceStore to manage it.</p>
+
+<p>Before introducing how to use HDFSResourceStore instead of
HBaseResourceStore in Spark Cubing, let's look at the Kylin metadata format
and how Kylin manages metadata.</p>
+
+<p>Every concrete piece of metadata for a table, cube, model or project is a
JSON file in Kylin. The whole metadata is organized as a directory tree. The
picture below shows the root directory of the Kylin metadata:<br />
+<img
src="http://static.zybuluo.com/kangkaisen/t1tc6neiaebiyfoir4fdhs11/%E5%B1%8F%E5%B9%95%E5%BF%AB%E7%85%A7%202017-07-02%20%E4%B8%8B%E5%8D%883.51.43.png"
alt="Screenshot 2017-07-02 3.51.43 PM.png-20.7kB" /><br />
+The following picture shows the content of the project dir; "learn_kylin"
and "kylin_test" are both project names.<br />
+<img
src="http://static.zybuluo.com/kangkaisen/4dtiioqnw08w6vtj0r9u5f27/%E5%B1%8F%E5%B9%95%E5%BF%AB%E7%85%A7%202017-07-02%20%E4%B8%8B%E5%8D%883.54.59.png"
alt="Screenshot 2017-07-02 3.54.59 PM.png-11.8kB" /></p>
+
+<p>Kylin manages metadata through ResourceStore, an abstract class that
abstracts the CRUD interface for metadata. ResourceStore has three
implementation classes:</p>
+
+<ul>
+ <li>FileResourceStore (store with Local FileSystem)</li>
+ <li>HDFSResourceStore</li>
+ <li>HBaseResourceStore</li>
+</ul>
+
+<p>Currently, only HBaseResourceStore can be used in production environments.
FileResourceStore is mainly used for testing. HDFSResourceStore doesn't support
massive concurrent writes, but it is ideal for read-only scenarios like
Cubing. Kylin uses the "kylin.metadata.url" config to decide which kind of
ResourceStore will be used.</p>
+
+<p>Now, let's see how to use HDFSResourceStore instead of HBaseResourceStore
in Spark Cubing.</p>
+
+<ol>
+ <li>Determine the metadata that the Spark Cubing job needs.</li>
+ <li>Dump the necessary metadata from HBase to the local file system.</li>
+ <li>Update kylin.metadata.url and then write all Kylin config to a
"kylin.properties" file in the local metadata dir.</li>
+ <li>Use ResourceTool to upload the local metadata to HDFS.</li>
+ <li>Construct the HDFSResourceStore from the HDFS "kylin.properties"
file in the Spark executor.</li>
+</ol>
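+
+<p>Step 3 above, and the executor-side read in step 5, can be sketched with
plain JDK classes. This is only an illustration under stated assumptions, not
Kylin's implementation: the class MetadataDumpSketch, its method names and the
example HDFS path are hypothetical, a local directory stands in for HDFS, and
the upload itself (step 4, done with ResourceTool) is left out.</p>

```java
import java.io.IOException;
import java.io.Reader;
import java.io.Writer;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Properties;

// Hypothetical helper, not part of Kylin.
final class MetadataDumpSketch {

    // Step 3: override kylin.metadata.url on a copy of the config and write
    // every property to "kylin.properties" inside the local metadata dir.
    static Path writeKylinProperties(Path localMetaDir, Properties kylinConfig,
                                     String newMetadataUrl) throws IOException {
        Properties copy = new Properties();
        copy.putAll(kylinConfig);
        copy.setProperty("kylin.metadata.url", newMetadataUrl);
        Files.createDirectories(localMetaDir);
        Path target = localMetaDir.resolve("kylin.properties");
        try (Writer out = Files.newBufferedWriter(target, StandardCharsets.UTF_8)) {
            copy.store(out, "Kylin config for one Spark Cubing job");
        }
        return target;
    }

    // Step 5: what an executor would do before constructing its resource
    // store, here reading from a local path instead of HDFS.
    static Properties readKylinProperties(Path file) throws IOException {
        Properties props = new Properties();
        try (Reader in = Files.newBufferedReader(file, StandardCharsets.UTF_8)) {
            props.load(in);
        }
        return props;
    }

    public static void main(String[] args) throws IOException {
        Properties config = new Properties();
        config.setProperty("kylin.metadata.url", "kylin_metadata@hbase");
        config.setProperty("kylin.engine.spark.rdd-partition-cut-mb", "100");

        Path dir = Files.createTempDirectory("kylin-meta-");
        // Placeholder value: the real URL format depends on the Kylin version.
        Path file = writeKylinProperties(dir, config, "/kylin/spark_cubing_meta/job_0001");
        System.out.println(readKylinProperties(file).getProperty("kylin.metadata.url"));
    }
}
```

+<p>The sketch copies the config before overriding kylin.metadata.url, so the
job server keeps using its own metadata store while only the dumped copy points
at the per-job location.</p>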
+
+<p>Of course, we need to delete the HDFS metadata dir when the job completes.
I'm working on a patch for this; please watch KYLIN-2653 for updates.</p>
+
+<h3 id="spark-configurations-for-cubing">Spark configurations for Cubing</h3>
+
+<p>The following is the Spark configuration I used in our environment. It
enables Spark dynamic resource allocation, so that our users need to set fewer
Spark configurations.</p>
+
+<div class="highlighter-rouge"><pre class="highlight"><code># run in yarn-cluster mode
+kylin.engine.spark-conf.spark.master=yarn
+kylin.engine.spark-conf.spark.submit.deployMode=cluster
+
+# enable Spark dynamic allocation so that users need not set the number of executors explicitly
+kylin.engine.spark-conf.spark.dynamicAllocation.enabled=true
+kylin.engine.spark-conf.spark.dynamicAllocation.minExecutors=10
+kylin.engine.spark-conf.spark.dynamicAllocation.maxExecutors=1024
+kylin.engine.spark-conf.spark.dynamicAllocation.executorIdleTimeout=300
+kylin.engine.spark-conf.spark.shuffle.service.enabled=true
+kylin.engine.spark-conf.spark.shuffle.service.port=7337
+
+# memory settings
+kylin.engine.spark-conf.spark.driver.memory=4G
+# enlarge executor.memory when the cube dict is huge,
+# because Kylin needs to load the cube dict in the executor
+kylin.engine.spark-conf.spark.executor.memory=4G
+kylin.engine.spark-conf.spark.executor.cores=1
+
+# enlarge the timeout
+kylin.engine.spark-conf.spark.network.timeout=600
+
+kylin.engine.spark-conf.spark.yarn.queue=root.hadoop.test
+
+kylin.engine.spark.rdd-partition-cut-mb=100
+</code></pre>
+</div>
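+
+<p>To make the naming convention above concrete, here is a small sketch of how
properties under the "kylin.engine.spark-conf." prefix could be turned into
spark-submit arguments. It assumes that Kylin strips the prefix and forwards
the remainder to Spark as "--conf key=value"; the class SparkConfSketch and its
method are hypothetical and are not Kylin's actual code.</p>

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper, not part of Kylin.
final class SparkConfSketch {
    static final String PREFIX = "kylin.engine.spark-conf.";

    // Keep only the properties under PREFIX, strip the prefix, and emit
    // "--conf key=value" pairs in sorted key order.
    static List<String> toSubmitArgs(Map<String, String> kylinProps) {
        List<String> args = new ArrayList<>();
        for (Map.Entry<String, String> e : new TreeMap<>(kylinProps).entrySet()) {
            if (e.getKey().startsWith(PREFIX)) {
                args.add("--conf");
                args.add(e.getKey().substring(PREFIX.length()) + "=" + e.getValue());
            }
        }
        return args;
    }

    public static void main(String[] args) {
        Map<String, String> props = new TreeMap<>();
        props.put("kylin.engine.spark-conf.spark.master", "yarn");
        props.put("kylin.engine.spark-conf.spark.executor.memory", "4G");
        // Does not carry the prefix, so it is treated as a Kylin-side setting.
        props.put("kylin.engine.spark.rdd-partition-cut-mb", "100");
        System.out.println(String.join(" ", toSubmitArgs(props)));
    }
}
```

+<p>Note that kylin.engine.spark.rdd-partition-cut-mb does not start with the
prefix, so in this sketch it is not passed on to Spark.</p>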
+
+<h3 id="performance-test-of-spark-cubing">Performance test of Spark Cubing</h3>
+
+<p>For source data ranging from millions to hundreds of millions of rows, my
test results are consistent with the blog <a
href="http://kylin.apache.org/blog/2017/02/23/by-layer-spark-cubing/">By-layer
Spark Cubing</a>. The improvement is remarkable. In addition, I specifically
tested with billions of rows of source data and with huge dictionaries.</p>
+
+<p>Test Cube1 has 2.7 billion rows of source data, 9 dimensions, and one
precise distinct count measure with a cardinality of 70 million (which means
the dict also has a cardinality of 70 million).</p>
+
+<p>Test Cube2 has 2.4 billion rows of source data, 13 dimensions, and 38
measures (including 9 precise distinct count measures).</p>
+
+<p>The test results are shown in the picture below; the unit of time is
minutes.<br />
+<img
src="http://static.zybuluo.com/kangkaisen/1urzfkal8od52fodi1l6u0y5/image.png"
alt="image.png-38.1kB" /></p>
+
+<p>In short, <strong>Spark Cubing is much faster than MR Cubing in most
scenarios</strong>.</p>
+
+<h3 id="pros-and-cons-of-spark-cubing">Pros and Cons of Spark Cubing</h3>
+<p>In my opinion, the advantages of Spark Cubing include:</p>
+
+<ol>
+ <li>Because of the RDD cache, Spark Cubing can take full advantage of
memory to avoid disk I/O.</li>
+ <li>When enough memory is available, Spark Cubing can use more of it to
get better build performance.</li>
+</ol>
+
+<p>On the other hand, the drawbacks of Spark Cubing include:</p>
+
+<ol>
+ <li>Spark Cubing can't handle huge dictionaries well (hundreds of
millions in cardinality);</li>
+ <li>Spark Cubing isn't stable enough for very large scale data.</li>
+</ol>
+
+<h3 id="applicable-scenarios-of-spark-cubing">Applicable scenarios of Spark
Cubing</h3>
+<p>In my opinion, except for the huge dictionary scenario, Spark Cubing can
replace MR Cubing everywhere, especially in the following scenarios:</p>
+
+<ol>
+ <li>Many dimensions</li>
+ <li>Normal dictionaries (e.g., cardinality &lt; 100 million)</li>
+ <li>Normal scale data (e.g., less than 10 billion rows to build at once).</li>
+</ol>
+
+<h3 id="improvement-for-dictionary-loading-in-spark-cubing">Improvement for
dictionary loading in Spark Cubing</h3>
+
+<p>As we all know, a big difference between MR and Spark is that an MR task
runs in its own process, while a Spark task runs in a thread. So in MR Cubing
the dict of a Cube is loaded only once per task process, but in Spark Cubing
the dict is loaded many times in one executor, which causes frequent GC.</p>
+
+<p>So I made two improvements:</p>
+
+<ol>
+ <li>Load the dict only once per executor.</li>
+ <li>Set maximumSize on the LoadingCache in AppendTrieDictionary so that
the dict is evicted as early as possible.</li>
+</ol>
+
+<p>These two improvements have been contributed to the Kylin repository.</p>
+
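+<p>The idea behind both improvements can be sketched with JDK classes only.
This is a minimal illustration, not the code contributed to Kylin: the class
DictCacheSketch is hypothetical, and a size-bounded LinkedHashMap stands in for
a Guava LoadingCache configured with maximumSize. One instance shared by all
task threads of an executor JVM loads each dict once, and the size bound evicts
the least recently used dict.</p>

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;

// Hypothetical stand-in for a size-bounded loading cache.
final class DictCacheSketch<K, V> {
    private final int maximumSize;
    private final Function<K, V> loader;
    private final AtomicInteger loadCount = new AtomicInteger();
    private final LinkedHashMap<K, V> cache;

    DictCacheSketch(int maximumSize, Function<K, V> loader) {
        this.maximumSize = maximumSize;
        this.loader = loader;
        // accessOrder=true keeps entries in least-recently-used order.
        this.cache = new LinkedHashMap<K, V>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
                return size() > DictCacheSketch.this.maximumSize;
            }
        };
    }

    // synchronized: all task threads in the JVM share this instance, so a
    // dict is loaded at most once while it stays in the cache.
    synchronized V get(K key) {
        V value = cache.get(key);
        if (value == null) {
            value = loader.apply(key);
            loadCount.incrementAndGet();
            cache.put(key, value);
        }
        return value;
    }

    int loads() {
        return loadCount.get();
    }

    synchronized int size() {
        return cache.size();
    }

    public static void main(String[] args) throws InterruptedException {
        DictCacheSketch<String, String> dicts =
                new DictCacheSketch<>(2, name -> "dict-for-" + name);
        Thread[] tasks = new Thread[8];
        for (int i = 0; i < tasks.length; i++) {
            tasks[i] = new Thread(() -> dicts.get("USER_ID"));
            tasks[i].start();
        }
        for (Thread t : tasks) {
            t.join();
        }
        System.out.println("loads = " + dicts.loads());
    }
}
```

+<p>A coarse lock keeps the sketch short; a production cache would normally
avoid holding one lock while a large dict is being loaded.</p>
+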
+<h3 id="summary">Summary</h3>
+<p>Spark Cubing is a great feature of Kylin 2.0; thanks to the Kylin
community. We will apply Spark Cubing to real scenarios in our company. I
believe Spark Cubing will become more robust and efficient in future
releases.</p>
+
Modified: kylin/site/blog/index.html
URL:
http://svn.apache.org/viewvc/kylin/site/blog/index.html?rev=1802649&r1=1802648&r2=1802649&view=diff
==============================================================================
--- kylin/site/blog/index.html (original)
+++ kylin/site/blog/index.html Sat Jul 22 01:40:55 2017
@@ -283,25 +283,25 @@
<li>
<h2 align="left" style="margin:0px">
- <a class="post-link" href="/blog/2016/05/26/release-v1.5.2/">Apache
Kylin v1.5.2 Release Announcement</a></h2><div align="left"
class="post-meta">posted: May 26, 2016</div>
+ <a class="post-link"
href="/cn/blog/2016/05/26/release-v1.5.2/">Apache Kylin v1.5.2
Officially Released</a></h2><div align="left" class="post-meta">posted: May 26,
2016</div>
</li>
<li>
<h2 align="left" style="margin:0px">
- <a class="post-link"
href="/cn/blog/2016/05/26/release-v1.5.2/">Apache Kylin v1.5.2
Officially Released</a></h2><div align="left" class="post-meta">posted: May 26,
2016</div>
+ <a class="post-link" href="/blog/2016/05/26/release-v1.5.2/">Apache
Kylin v1.5.2 Release Announcement</a></h2><div align="left"
class="post-meta">posted: May 26, 2016</div>
</li>
<li>
<h2 align="left" style="margin:0px">
- <a class="post-link"
href="/cn/blog/2016/04/12/release-v1.5.1/">Apache Kylin v1.5.1
Officially Released</a></h2><div align="left" class="post-meta">posted: Apr 12,
2016</div>
+ <a class="post-link" href="/blog/2016/04/12/release-v1.5.1/">Apache
Kylin v1.5.1 Release Announcement</a></h2><div align="left"
class="post-meta">posted: Apr 12, 2016</div>
</li>
<li>
<h2 align="left" style="margin:0px">
- <a class="post-link" href="/blog/2016/04/12/release-v1.5.1/">Apache
Kylin v1.5.1 Release Announcement</a></h2><div align="left"
class="post-meta">posted: Apr 12, 2016</div>
+ <a class="post-link"
href="/cn/blog/2016/04/12/release-v1.5.1/">Apache Kylin v1.5.1
Officially Released</a></h2><div align="left" class="post-meta">posted: Apr 12,
2016</div>
</li>
@@ -361,13 +361,13 @@
<li>
<h2 align="left" style="margin:0px">
- <a class="post-link" href="/blog/2015/12/23/release-v1.2/">Apache
Kylin v1.2 Release Announcement</a></h2><div align="left"
class="post-meta">posted: Dec 23, 2015</div>
+ <a class="post-link" href="/cn/blog/2015/12/23/release-v1.2/">Apache
Kylin v1.2 Officially Released</a></h2><div align="left" class="post-meta">posted: Dec
23, 2015</div>
</li>
<li>
<h2 align="left" style="margin:0px">
- <a class="post-link" href="/cn/blog/2015/12/23/release-v1.2/">Apache
Kylin v1.2 Officially Released</a></h2><div align="left" class="post-meta">posted: Dec
23, 2015</div>
+ <a class="post-link" href="/blog/2015/12/23/release-v1.2/">Apache
Kylin v1.2 Release Announcement</a></h2><div align="left"
class="post-meta">posted: Dec 23, 2015</div>
</li>
Modified: kylin/site/docs20/tutorial/cube_spark.html
URL:
http://svn.apache.org/viewvc/kylin/site/docs20/tutorial/cube_spark.html?rev=1802649&r1=1802648&r2=1802649&view=diff
==============================================================================
--- kylin/site/docs20/tutorial/cube_spark.html (original)
+++ kylin/site/docs20/tutorial/cube_spark.html Sat Jul 22 01:40:55 2017
@@ -2827,7 +2827,7 @@ $KYLIN_HOME/bin/kylin.sh start</code></p
<h2 id="go-further">Go further</h2>
-<p>If you're a Kylin administrator but new to Spark, we suggest you go through
<a href="https://spark.apache.org/docs/1.6.3/">Spark documents</a>, and don't
forget to update the configurations accordingly. Spark's performance relies
on the cluster's memory and CPU resources, while Kylin's Cube build is a heavy
task when it has a complex data model and a huge dataset to build at one time.
If your cluster resources are insufficient, errors like "OutOfMemory"
will be thrown in Spark executors, so please use it properly. For a Cube that
has UHC dimensions, many combinations (e.g., a full cube with more than 12
dimensions), or memory-hungry measures (Count Distinct, Top-N), we suggest
using the MapReduce engine. If your Cube model is simple, all measures are
SUM/MIN/MAX/COUNT, and the source data is small to medium scale, the Spark
engine would be a good choice. Besides, Streaming build isn't supported in this
engine so far (KYLIN-2484).</p>
+<p>If you're a Kylin administrator but new to Spark, we suggest you go through
<a href="https://spark.apache.org/docs/1.6.3/">Spark documents</a>, and don't
forget to update the configurations accordingly. You can enable Spark <a
href="https://spark.apache.org/docs/1.6.1/configuration.html#dynamic-allocation">Dynamic
Resource Allocation</a> so that it can scale up and down automatically for
different workloads. Spark's performance relies on the cluster's memory and CPU
resources, while Kylin's Cube build is a heavy task when it has a complex data
model and a huge dataset to build at one time. If your cluster resources are
insufficient, errors like "OutOfMemory" will be thrown in Spark executors, so
please use it properly. For a Cube that has UHC dimensions, many combinations
(e.g., a full cube with more than 12 dimensions), or memory-hungry measures
(Count Distinct, Top-N), we suggest using the MapReduce engine. If your Cube
model is simple, all measures are SUM/MIN/MAX/COUNT, and the source data is
small to medium scale, the Spark engine would be a good choice. Besides,
Streaming build isn't supported in this engine so far (KYLIN-2484).</p>
<p>Now the Spark engine is in public beta; If you have any question, comment,
or bug fix, welcome to discuss in [email protected].</p>
Modified: kylin/site/feed.xml
URL:
http://svn.apache.org/viewvc/kylin/site/feed.xml?rev=1802649&r1=1802648&r2=1802649&view=diff
==============================================================================
--- kylin/site/feed.xml (original)
+++ kylin/site/feed.xml Sat Jul 22 01:40:55 2017
@@ -19,11 +19,172 @@
<description>Apache Kylin Home</description>
<link>http://kylin.apache.org/</link>
<atom:link href="http://kylin.apache.org/feed.xml" rel="self"
type="application/rss+xml"/>
- <pubDate>Tue, 27 Jun 2017 02:35:13 -0700</pubDate>
- <lastBuildDate>Tue, 27 Jun 2017 02:35:13 -0700</lastBuildDate>
+ <pubDate>Fri, 21 Jul 2017 18:39:35 -0700</pubDate>
+ <lastBuildDate>Fri, 21 Jul 2017 18:39:35 -0700</lastBuildDate>
<generator>Jekyll v2.5.3</generator>
<item>
+ <title>Improving Spark Cubing</title>
+ <description><h1
id="improving-spark-cubing-in-kylin-20">Improving Spark Cubing in
Kylin 2.0</h1>
+
+<p>Author: Kaisen Kang</p>
+
+<hr />
+
+<p>Apache Kylin is an OLAP engine that speeds up queries through Cube
precomputation. A Cube is a multi-dimensional dataset that contains all
measures precomputed for every combination of dimensions. Before v2.0, Kylin
used MapReduce to build Cubes. To get better performance, Kylin 2.0 introduced
Spark Cubing. For the principles behind Spark Cubing, please refer to the
article <a
href="http://kylin.apache.org/blog/2017/02/23/by-layer-spark-cubing/">By-layer
Spark Cubing</a>.</p>
+
+<p>In this blog, I will talk about the following topics:</p>
+
+<ul>
+ <li>How to make Spark Cubing support HBase cluster with Kerberos
enabled</li>
+ <li>Spark configurations for Cubing</li>
+ <li>Performance of Spark Cubing</li>
+ <li>Pros and cons of Spark Cubing</li>
+ <li>Applicable scenarios of Spark Cubing</li>
+ <li>Improvement for dictionary loading in Spark Cubing</li>
+</ul>
+
+<p>The current Spark Cubing version (2.0) doesn't support HBase clusters
that use Kerberos, because Spark Cubing needs to get metadata from HBase. There
are two ways to solve this problem: one is to make Spark able to connect to
HBase with Kerberos, the other is to avoid connecting to HBase from Spark
Cubing at all.</p>
+
+<h3 id="make-spark-connect-hbase-with-kerberos-enabled">Make
Spark connect to HBase with Kerberos enabled</h3>
+<p>If we just want to run Spark Cubing in YARN client mode, we only need to
add three lines of code before new SparkConf() in SparkCubingByLayer:</p>
+
+<div class="highlighter-rouge"><pre class="highlight"><code> Configuration configuration = HBaseConnection.getCurrentHBaseConfiguration();
+ HConnection connection = HConnectionManager.createConnection(configuration);
+ // Obtain an authentication token for the given user and add it to the user's credentials.
+ TokenUtil.obtainAndCacheToken(connection, UserProvider.instantiate(configuration).create(UserGroupInformation.getCurrentUser()));
+</code></pre>
+</div>
+
+<p>As for how to make Spark connect to HBase using Kerberos in YARN cluster
mode, please refer to SPARK-6918, SPARK-12279, and HBASE-17040. That solution
may work, but it is not elegant, so I tried the second solution.</p>
+
+<h3 id="use-hdfs-metastore-for-spark-cubing">Use HDFS
metastore for Spark Cubing</h3>
+
+<p>The core idea here is to upload the metadata that the job needs to HDFS
and to use HDFSResourceStore to manage it.</p>
+
+<p>Before introducing how to use HDFSResourceStore instead of
HBaseResourceStore in Spark Cubing, let's look at the Kylin metadata format
and how Kylin manages metadata.</p>
+
+<p>Every concrete piece of metadata for a table, cube, model or project is a
JSON file in Kylin. The whole metadata is organized as a directory tree. The
picture below shows the root directory of the Kylin metadata:<br />
+<img
src="http://static.zybuluo.com/kangkaisen/t1tc6neiaebiyfoir4fdhs11/%E5%B1%8F%E5%B9%95%E5%BF%AB%E7%85%A7%202017-07-02%20%E4%B8%8B%E5%8D%883.51.43.png"
alt="Screenshot 2017-07-02 3.51.43 PM.png-20.7kB" /><br />
+The following picture shows the content of the project dir; "learn_kylin"
and "kylin_test" are both project names.<br />
+<img
src="http://static.zybuluo.com/kangkaisen/4dtiioqnw08w6vtj0r9u5f27/%E5%B1%8F%E5%B9%95%E5%BF%AB%E7%85%A7%202017-07-02%20%E4%B8%8B%E5%8D%883.54.59.png"
alt="Screenshot 2017-07-02 3.54.59 PM.png-11.8kB" /></p>
+
+<p>Kylin manages metadata through ResourceStore, an abstract class that
abstracts the CRUD interface for metadata. ResourceStore has three
implementation classes:</p>
+
+<ul>
+ <li>FileResourceStore (store with Local FileSystem)</li>
+ <li>HDFSResourceStore</li>
+ <li>HBaseResourceStore</li>
+</ul>
+
+<p>Currently, only HBaseResourceStore can be used in production environments.
FileResourceStore is mainly used for testing. HDFSResourceStore doesn't support
massive concurrent writes, but it is ideal for read-only scenarios like
Cubing. Kylin uses the "kylin.metadata.url" config to decide which kind of
ResourceStore will be used.</p>
+
+<p>Now, let's see how to use HDFSResourceStore instead of
HBaseResourceStore in Spark Cubing.</p>
+
+<ol>
+ <li>Determine the metadata that the Spark Cubing job needs.</li>
+ <li>Dump the necessary metadata from HBase to the local file system.</li>
+ <li>Update kylin.metadata.url and then write all Kylin config to a
"kylin.properties" file in the local metadata dir.</li>
+ <li>Use ResourceTool to upload the local metadata to HDFS.</li>
+ <li>Construct the HDFSResourceStore from the HDFS
"kylin.properties" file in the Spark executor.</li>
+</ol>
+
+<p>Of course, we need to delete the HDFS metadata dir when the job completes.
I'm working on a patch for this; please watch KYLIN-2653 for updates.</p>
+
+<h3 id="spark-configurations-for-cubing">Spark configurations
for Cubing</h3>
+
+<p>The following is the Spark configuration I used in our environment. It
enables Spark dynamic resource allocation, so that our users need to set fewer
Spark configurations.</p>
+
+<div class="highlighter-rouge"><pre class="highlight"><code># run in yarn-cluster mode
+kylin.engine.spark-conf.spark.master=yarn
+kylin.engine.spark-conf.spark.submit.deployMode=cluster
+
+# enable Spark dynamic allocation so that users need not set the number of executors explicitly
+kylin.engine.spark-conf.spark.dynamicAllocation.enabled=true
+kylin.engine.spark-conf.spark.dynamicAllocation.minExecutors=10
+kylin.engine.spark-conf.spark.dynamicAllocation.maxExecutors=1024
+kylin.engine.spark-conf.spark.dynamicAllocation.executorIdleTimeout=300
+kylin.engine.spark-conf.spark.shuffle.service.enabled=true
+kylin.engine.spark-conf.spark.shuffle.service.port=7337
+
+# memory settings
+kylin.engine.spark-conf.spark.driver.memory=4G
+# enlarge executor.memory when the cube dict is huge,
+# because Kylin needs to load the cube dict in the executor
+kylin.engine.spark-conf.spark.executor.memory=4G
+kylin.engine.spark-conf.spark.executor.cores=1
+
+# enlarge the timeout
+kylin.engine.spark-conf.spark.network.timeout=600
+
+kylin.engine.spark-conf.spark.yarn.queue=root.hadoop.test
+
+kylin.engine.spark.rdd-partition-cut-mb=100
+</code></pre>
+</div>
+
+<h3 id="performance-test-of-spark-cubing">Performance test of
Spark Cubing</h3>
+
+<p>For source data ranging from millions to hundreds of millions of rows, my
test results are consistent with the blog <a
href="http://kylin.apache.org/blog/2017/02/23/by-layer-spark-cubing/">By-layer
Spark Cubing</a>. The improvement is remarkable. In addition, I specifically
tested with billions of rows of source data and with huge dictionaries.</p>
+
+<p>Test Cube1 has 2.7 billion rows of source data, 9 dimensions, and one
precise distinct count measure with a cardinality of 70 million (which means
the dict also has a cardinality of 70 million).</p>
+
+<p>Test Cube2 has 2.4 billion rows of source data, 13 dimensions, and 38
measures (including 9 precise distinct count measures).</p>
+
+<p>The test results are shown in the picture below; the unit of time is
minutes.<br />
+<img
src="http://static.zybuluo.com/kangkaisen/1urzfkal8od52fodi1l6u0y5/image.png"
alt="image.png-38.1kB" /></p>
+
+<p>In short, <strong>Spark Cubing is much faster than MR Cubing
in most scenarios</strong>.</p>
+
+<h3 id="pros-and-cons-of-spark-cubing">Pros and Cons of Spark
Cubing</h3>
+<p>In my opinion, the advantages of Spark Cubing include:</p>
+
+<ol>
+ <li>Because of the RDD cache, Spark Cubing can take full advantage
of memory to avoid disk I/O.</li>
+ <li>When enough memory is available, Spark Cubing can use more of it
to get better build performance.</li>
+</ol>
+
+<p>On the other hand, the drawbacks of Spark Cubing include:</p>
+
+<ol>
+ <li>Spark Cubing can't handle huge dictionaries well (hundreds of
millions in cardinality);</li>
+ <li>Spark Cubing isn't stable enough for very large scale
data.</li>
+</ol>
+
+<h3 id="applicable-scenarios-of-spark-cubing">Applicable
scenarios of Spark Cubing</h3>
+<p>In my opinion, except for the huge dictionary scenario, Spark Cubing can
replace MR Cubing everywhere, especially in the following scenarios:</p>
+
+<ol>
+ <li>Many dimensions</li>
+ <li>Normal dictionaries (e.g., cardinality &lt; 100 million)</li>
+ <li>Normal scale data (e.g., less than 10 billion rows to build at
once).</li>
+</ol>
+
+<h3
id="improvement-for-dictionary-loading-in-spark-cubing">Improvement
for dictionary loading in Spark Cubing</h3>
+
+<p>As we all know, a big difference between MR and Spark is that an MR task
runs in its own process, while a Spark task runs in a thread. So in MR Cubing
the dict of a Cube is loaded only once per task process, but in Spark Cubing
the dict is loaded many times in one executor, which causes frequent GC.</p>
+
+<p>So I made two improvements:</p>
+
+<ol>
+ <li>Load the dict only once per executor.</li>
+ <li>Set maximumSize on the LoadingCache in AppendTrieDictionary so that
the dict is evicted as early as possible.</li>
+</ol>
+
+<p>These two improvements have been contributed to the Kylin
repository.</p>
+
+<h3 id="summary">Summary</h3>
+<p>Spark Cubing is a great feature of Kylin 2.0; thanks to the Kylin
community. We will apply Spark Cubing to real scenarios in our company. I
believe Spark Cubing will become more robust and efficient in future
releases.</p>
+
+</description>
+ <pubDate>Fri, 21 Jul 2017 00:00:00 -0700</pubDate>
+ <link>http://kylin.apache.org/2017/07/21/Improving-Spark-Cubing/</link>
+ <guid
isPermaLink="true">http://kylin.apache.org/2017/07/21/Improving-Spark-Cubing/</guid>
+
+
+ </item>
+
+ <item>
<title>A new measure for Percentile precalculation</title>
<description><h2
id="introduction">Introduction</h2>
@@ -65,54 +226,54 @@
</item>
<item>
- <title>Apache Kylin v2.0.0 Beta Announcement</title>
- <description><p>The Apache Kylin community is pleased to
announce the <a href="http://kylin.apache.org/download/">v2.0.0
beta package</a> is ready for download and test.</p>
+ <title>Apache Kylin v2.0.0 beta Released</title>
+ <description><p>The Apache Kylin community is very pleased to
announce that the <a
href="http://kylin.apache.org/cn/download/">v2.0.0 beta
package</a> is now available for download and testing.</p>
<ul>
- <li>Download link: <a
href="http://kylin.apache.org/download/">http://kylin.apache.org/download/</a></li>
- <li>Source code:
https://github.com/apache/kylin/tree/kylin-2.0.0-beta</li>
+ <li>Download link: <a
href="http://kylin.apache.org/cn/download/">http://kylin.apache.org/cn/download/</a></li>
+ <li>Source code:
https://github.com/apache/kylin/tree/kylin-2.0.0-beta</li>
</ul>
-<p>It has been more than 2 months since the v1.6.0 release. The community
has been working hard to deliver some long-wanted features, hoping to move
Apache Kylin to the next level.</p>
+<p>It has been more than 2 months since the v1.6.0 release. During this time,
the whole community worked together to complete a series of major features,
hoping to take Apache Kylin to a new level.</p>
<ul>
- <li>Support snowflake data model (<a
href="https://issues.apache.org/jira/browse/KYLIN-1875">KYLIN-1875</a>)</li>
- <li>Support TPC-H queries (<a
href="https://issues.apache.org/jira/browse/KYLIN-2467">KYLIN-2467</a>)</li>
- <li>Spark cubing engine (<a
href="https://issues.apache.org/jira/browse/KYLIN-2331">KYLIN-2331</a>)</li>
- <li>Job engine HA (<a
href="https://issues.apache.org/jira/browse/KYLIN-2006">KYLIN-2006</a>)</li>
- <li>Percentile measure (<a
href="https://issues.apache.org/jira/browse/KYLIN-2396">KYLIN-2396</a>)</li>
- <li>Cloud tested (<a
href="https://issues.apache.org/jira/browse/KYLIN-2351">KYLIN-2351</a>)</li>
+ <li>Support snowflake model (<a
href="https://issues.apache.org/jira/browse/KYLIN-1875">KYLIN-1875</a>)</li>
+ <li>Support TPC-H queries (<a
href="https://issues.apache.org/jira/browse/KYLIN-2467">KYLIN-2467</a>)</li>
+ <li>Spark build engine (<a
href="https://issues.apache.org/jira/browse/KYLIN-2331">KYLIN-2331</a>)</li>
+ <li>Job Engine high availability (<a
href="https://issues.apache.org/jira/browse/KYLIN-2006">KYLIN-2006</a>)</li>
+ <li>Percentile measure (<a
href="https://issues.apache.org/jira/browse/KYLIN-2396">KYLIN-2396</a>)</li>
+ <li>Tested on Cloud (<a
href="https://issues.apache.org/jira/browse/KYLIN-2351">KYLIN-2351</a>)</li>
</ul>
-<p>You are very welcome to give the v2.0.0 beta a try, and please do
send feedbacks to <a
href="&#109;&#097;&#105;&#108;&#116;&#111;:&#100;&#101;&#118;&#064;&#107;&#121;&#108;&#105;&#110;&#046;&#097;&#112;&#097;&#099;&#104;&#101;&#046;&#111;&#114;&#103;">&#100;&#101;&#118;&#064;&#107;&#121;&#108;&#105;&#110;&#046;&#097;&#112;&#097;&#099;&#104;&#101;&#046;&#111;&#114;&#103;</a>.</p>
+<p>You are very welcome to download and test v2.0.0 beta. Your feedback is
very important to us; please send email to <a
href="&#109;&#097;&#105;&#108;&#116;&#111;:&#100;&#101;&#118;&#064;&#107;&#121;&#108;&#105;&#110;&#046;&#097;&#112;&#097;&#099;&#104;&#101;&#046;&#111;&#114;&#103;">&#100;&#101;&#118;&#064;&#107;&#121;&#108;&#105;&#110;&#046;&#097;&#112;&#097;&#099;&#104;&#101;&#046;&#111;&#114;&#103;</a>.</p>
<hr />
-<h2 id="install">Install</h2>
+<h2 id="section">Install</h2>
-<p>The v2.0.0 beta requires a fresh install at the moment. It cannot
be upgraded from v1.6.0 due to the incompatible metadata. However, the
underlying cube is backward compatible. We are working on an upgrade tool to
transform the metadata, so that a smooth upgrade will be possible.</p>
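+<p>For now, v2.0.0 beta cannot be upgraded directly from v1.6.0; a fresh
installation is required. This is because the metadata of the new version is
not backward compatible. Fortunately the Cube data is backward compatible, so
only a metadata conversion tool needs to be developed to enable a smooth
upgrade in the near future. We are working on it.</p>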
<hr />
-<h2 id="run-tpc-h-benchmark">Run TPC-H Benchmark</h2>
+<h2 id="tpc-h-">Run the TPC-H benchmark</h2>
-<p>Steps to run TPC-H benchmark on Apache Kylin can be found here: <a
href="https://github.com/Kyligence/kylin-tpch">https://github.com/Kyligence/kylin-tpch</a></p>
+<p>Detailed steps for running TPC-H on Apache Kylin: <a
href="https://github.com/Kyligence/kylin-tpch">https://github.com/Kyligence/kylin-tpch</a></p>
<hr />
-<h2 id="spark-cubing-engine">Spark Cubing Engine</h2>
+<h2 id="spark-">Spark build engine</h2>
-<p>Apache Kylin v2.0.0 introduced a new cubing engine based on Apache
Spark that can be selected to replace the original MR engine. Initial tests
showed that the spark engine could cut the build time to 50% in most
cases.</p>
+<p>Apache Kylin v2.0.0 introduces a brand-new build engine based on Apache
Spark. It can be used to replace the original MapReduce build engine. Initial
tests show that Cube build time can generally be cut to about 50% of the
original.</p>
-<p>To enable the Spark cubing engine, check <a
href="/docs16/tutorial/cube_spark.html">this
tutorial</a>.</p>
+<p>To enable the Spark build engine, please refer to <a
href="/docs16/tutorial/cube_spark.html">this document</a>.</p>
<hr />
-<p><em>Great thanks to everyone who
contributed!</em></p>
+<p><em>Thanks to every friend for participating and contributing!</em></p>
</description>
<pubDate>Sat, 25 Feb 2017 12:00:00 -0800</pubDate>
- <link>http://kylin.apache.org/blog/2017/02/25/v2.0.0-beta-ready/</link>
- <guid
isPermaLink="true">http://kylin.apache.org/blog/2017/02/25/v2.0.0-beta-ready/</guid>
+
<link>http://kylin.apache.org/cn/blog/2017/02/25/v2.0.0-beta-ready/</link>
+ <guid
isPermaLink="true">http://kylin.apache.org/cn/blog/2017/02/25/v2.0.0-beta-ready/</guid>
<category>blog</category>
@@ -120,54 +281,54 @@
</item>
<item>
- <title>Apache Kylin v2.0.0 beta Released</title>
- <description><p>The Apache Kylin community is very pleased to
announce that the <a
href="http://kylin.apache.org/cn/download/">v2.0.0 beta
package</a> is now available for download and testing.</p>
+ <title>Apache Kylin v2.0.0 Beta Announcement</title>
+ <description><p>The Apache Kylin community is pleased to
announce the <a href="http://kylin.apache.org/download/">v2.0.0
beta package</a> is ready for download and test.</p>
<ul>
- <li>Download link: <a
href="http://kylin.apache.org/cn/download/">http://kylin.apache.org/cn/download/</a></li>
- <li>Source code:
https://github.com/apache/kylin/tree/kylin-2.0.0-beta</li>
+ <li>Download link: <a
href="http://kylin.apache.org/download/">http://kylin.apache.org/download/</a></li>
+ <li>Source code:
https://github.com/apache/kylin/tree/kylin-2.0.0-beta</li>
</ul>
-<p>It has been more than two months since the v1.6.0 release. During
this time, the whole community has worked together to deliver a series of
major features, hoping to take Apache Kylin to a new level.</p>
+<p>It has been more than 2 months since the v1.6.0 release. The community
has been working hard to deliver some long-awaited features, hoping to move
Apache Kylin to the next level.</p>
<ul>
- <li>Support for the snowflake model (<a
href="https://issues.apache.org/jira/browse/KYLIN-1875">KYLIN-1875</a>)</li>
- <li>Support for TPC-H queries (<a
href="https://issues.apache.org/jira/browse/KYLIN-2467">KYLIN-2467</a>)</li>
- <li>Spark cubing engine (<a
href="https://issues.apache.org/jira/browse/KYLIN-2331">KYLIN-2331</a>)</li>
- <li>Job Engine high availability (<a
href="https://issues.apache.org/jira/browse/KYLIN-2006">KYLIN-2006</a>)</li>
- <li>Percentile measure (<a
href="https://issues.apache.org/jira/browse/KYLIN-2396">KYLIN-2396</a>)</li>
- <li>Tested on Cloud (<a
href="https://issues.apache.org/jira/browse/KYLIN-2351">KYLIN-2351</a>)</li>
+ <li>Support snowflake data model (<a
href="https://issues.apache.org/jira/browse/KYLIN-1875">KYLIN-1875</a>)</li>
+ <li>Support TPC-H queries (<a
href="https://issues.apache.org/jira/browse/KYLIN-2467">KYLIN-2467</a>)</li>
+ <li>Spark cubing engine (<a
href="https://issues.apache.org/jira/browse/KYLIN-2331">KYLIN-2331</a>)</li>
+ <li>Job engine HA (<a
href="https://issues.apache.org/jira/browse/KYLIN-2006">KYLIN-2006</a>)</li>
+ <li>Percentile measure (<a
href="https://issues.apache.org/jira/browse/KYLIN-2396">KYLIN-2396</a>)</li>
+ <li>Cloud tested (<a
href="https://issues.apache.org/jira/browse/KYLIN-2351">KYLIN-2351</a>)</li>
</ul>
-<p>Everyone is very welcome to download and test the v2.0.0 beta.
Your feedback is very important to us; please send email to <a
href="&#109;&#097;&#105;&#108;&#116;&#111;:&#100;&#101;&#118;&#064;&#107;&#121;&#108;&#105;&#110;&#046;&#097;&#112;&#097;&#099;&#104;&#101;&#046;&#111;&#114;&#103;">&#100;&#101;&#118;&#064;&#107;&#121;&#108;&#105;&#110;&#046;&#097;&#112;&#097;&#099;&#104;&#101;&#046;&#111;&#114;&#103;</a>.</p>
+<p>You are very welcome to give the v2.0.0 beta a try, and please do
send feedback to <a
href="&#109;&#097;&#105;&#108;&#116;&#111;:&#100;&#101;&#118;&#064;&#107;&#121;&#108;&#105;&#110;&#046;&#097;&#112;&#097;&#099;&#104;&#101;&#046;&#111;&#114;&#103;">&#100;&#101;&#118;&#064;&#107;&#121;&#108;&#105;&#110;&#046;&#097;&#112;&#097;&#099;&#104;&#101;&#046;&#111;&#114;&#103;</a>.</p>
<hr />
-<h2 id="section">Install</h2>
+<h2 id="install">Install</h2>
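-<p>For now, the v2.0.0 beta cannot be upgraded directly from v1.6.0
and requires a fresh install, because the metadata of the new version is
not backward compatible. Fortunately, the Cube data is backward compatible,
so only a metadata conversion tool needs to be developed to make a smooth
upgrade possible in the near future. We are working on it.</p>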
+<p>The v2.0.0 beta requires a fresh install at the moment. It cannot
be upgraded from v1.6.0 due to incompatible metadata. However, the
underlying cube is backward compatible. We are working on an upgrade tool to
transform the metadata, so that a smooth upgrade will be possible.</p>
<hr />
-<h2 id="tpc-h-">Run TPC-H Benchmark</h2>
+<h2 id="run-tpc-h-benchmark">Run TPC-H Benchmark</h2>
-<p>Detailed steps to run TPC-H on Apache Kylin: <a
href="https://github.com/Kyligence/kylin-tpch">https://github.com/Kyligence/kylin-tpch</a></p>
+<p>Steps to run TPC-H benchmark on Apache Kylin can be found here: <a
href="https://github.com/Kyligence/kylin-tpch">https://github.com/Kyligence/kylin-tpch</a></p>
<hr />
-<h2 id="spark-">Spark Cubing Engine</h2>
+<h2 id="spark-cubing-engine">Spark Cubing Engine</h2>
-<p>Apache Kylin v2.0.0 introduces a brand-new cubing engine based on
Apache Spark. It can be used in place of the original MapReduce cubing
engine. Initial tests show that Cube build time can generally be cut to
about 50% of the original.</p>
+<p>Apache Kylin v2.0.0 introduced a new cubing engine based on Apache
Spark that can be selected to replace the original MR engine. Initial tests
showed that the spark engine could cut the build time to 50% in most
cases.</p>
-<p>To enable the Spark cubing engine, see <a
href="/docs16/tutorial/cube_spark.html">this document</a>.</p>
+<p>To enable the Spark cubing engine, check <a
href="/docs16/tutorial/cube_spark.html">this
tutorial</a>.</p>
<hr />
-<p><em>Thanks to everyone for participating and contributing!</em></p>
+<p><em>Great thanks to everyone who
contributed!</em></p>
</description>
<pubDate>Sat, 25 Feb 2017 12:00:00 -0800</pubDate>
-
<link>http://kylin.apache.org/cn/blog/2017/02/25/v2.0.0-beta-ready/</link>
- <guid
isPermaLink="true">http://kylin.apache.org/cn/blog/2017/02/25/v2.0.0-beta-ready/</guid>
+ <link>http://kylin.apache.org/blog/2017/02/25/v2.0.0-beta-ready/</link>
+ <guid
isPermaLink="true">http://kylin.apache.org/blog/2017/02/25/v2.0.0-beta-ready/</guid>
<category>blog</category>
@@ -590,56 +751,6 @@ group by grouping sets((dim1, dim2), (di
<category>blog</category>
-
- </item>
-
- <item>
- <title>Query Metrics in Apache Kylin</title>
- <description><p>Apache Kylin has supported query metrics since 1.5.4.
This blog introduces why Kylin needs query metrics, the concrete contents
and meaning of the query metrics, their day-to-day uses, and how to
collect them.</p>
-
-<h2 id="background">Background</h2>
-<p>When Kylin becomes an enterprise application, you must ensure that the
Kylin query service is highly available and performs well. You also need to
commit to an SLA for the query service to users, which requires Kylin to
support query metrics.</p>
-
-<h2 id="introduction">Introduction</h2>
-<p>The query metrics have three levels: Server, Project, and Cube.</p>
-
-<p>For example, <code
class="highlighter-rouge">QueryCount</code> will have three
kinds of metrics:<br />
-```<br />
-Hadoop:name=Server_Total,service=Kylin.QueryCount<br />
-Hadoop:name=learn_kylin,service=Kylin.QueryCount<br />
-Hadoop:name=learn_kylin,service=Kylin,sub=kylin_sales_cube.QueryCount</p>
-
-<p>Server_Total represents a query server node,<br />
-learn_kylin is a project name,<br />
-kylin_sales_cube is a cube name.<br />
-```<br />
-### The Key Query Metrics</p>
-
-<ul>
- <li><code
class="highlighter-rouge">QueryCount</code>: the total
number of queries.</li>
- <li><code
class="highlighter-rouge">QueryFailCount</code>: the total
number of failed queries.</li>
- <li><code
class="highlighter-rouge">QuerySuccessCount</code>: the
total number of successful queries.</li>
- <li><code
class="highlighter-rouge">CacheHitCount</code>: the total
number of query cache hits.</li>
- <li><code
class="highlighter-rouge">QueryLatency60s99thPercentile</code>:
the 99th percentile of query latency over a 60s interval. (Kylin query metrics
provide five percentiles, 99th, 95th, 90th, 75th, and 50th, over three time
intervals, 60s, 360s, and 3600s. The time intervals can be set with <code
class="highlighter-rouge">kylin.query.metrics.percentiles.intervals</code>,
whose default value is <code class="highlighter-rouge">60,
360, 3600</code>.)</li>
- <li><code
class="highlighter-rouge">QueryLatencyAvgTime</code>, <code
class="highlighter-rouge">QueryLatencyIMaxTime</code>, <code
class="highlighter-rouge">QueryLatencyIMinTime</code>: the
average, maximum, and minimum query latency.</li>
- <li><code
class="highlighter-rouge">ScanRowCount</code>: the number of
rows scanned in HBase; it is reported like <code
class="highlighter-rouge">QueryLatency</code>.</li>
- <li><code
class="highlighter-rouge">ResultRowCount</code>: the number of
rows in the query result; it is reported like <code
class="highlighter-rouge">QueryLatency</code>.</li>
-</ul>
-
-<h2 id="daily-function">Daily Function</h2>
-<p>Besides backing the SLA of the query service, query metrics are useful
in daily operation and maintenance: you can build a daily Kylin query report
and a Kylin query dashboard from them, which helps you understand the patterns
and performance of Kylin queries and analyze query incidents.</p>
-
-<h2 id="how-to-use">How To Use</h2>
-<p>First, set the config <code
class="highlighter-rouge">kylin.query.metrics.enabled</code>
to true to collect query metrics to JMX.</p>
-
-<p>Second, you can use any JMX collection tool to collect the
query metrics into your monitoring system. Note that the query metrics have
three levels (Server, Project, and Cube), implemented with a dynamic <code
class="highlighter-rouge">ObjectName</code>, so you should
match the <code class="highlighter-rouge">ObjectName</code> with a
regular expression.</p>
-</description>
- <pubDate>Sat, 27 Aug 2016 10:30:00 -0700</pubDate>
-
<link>http://kylin.apache.org/blog/2016/08/27/query-metrics-in-kylin/</link>
- <guid
isPermaLink="true">http://kylin.apache.org/blog/2016/08/27/query-metrics-in-kylin/</guid>
-
-
- <category>blog</category>
</item>