Author: lidong
Date: Sat Jul 22 02:07:05 2017
New Revision: 1802650
URL: http://svn.apache.org/viewvc?rev=1802650&view=rev
Log:
fix format
Added:
kylin/site/blog/2017/07/
kylin/site/blog/2017/07/21/
kylin/site/blog/2017/07/21/Improving-Spark-Cubing/
kylin/site/blog/2017/07/21/Improving-Spark-Cubing/index.html
Modified:
kylin/site/blog/index.html
kylin/site/feed.xml
Added: kylin/site/blog/2017/07/21/Improving-Spark-Cubing/index.html
URL:
http://svn.apache.org/viewvc/kylin/site/blog/2017/07/21/Improving-Spark-Cubing/index.html?rev=1802650&view=auto
==============================================================================
--- kylin/site/blog/2017/07/21/Improving-Spark-Cubing/index.html (added)
+++ kylin/site/blog/2017/07/21/Improving-Spark-Cubing/index.html Sat Jul 22
02:07:05 2017
@@ -0,0 +1,410 @@
+<!--
+* Licensed to the Apache Software Foundation (ASF) under one
+* or more contributor license agreements. See the NOTICE file
+* distributed with this work for additional information
+* regarding copyright ownership. The ASF licenses this file
+* to you under the Apache License, Version 2.0 (the
+* "License"); you may not use this file except in compliance
+* with the License. You may obtain a copy of the License at
+*
+* http://www.apache.org/licenses/LICENSE-2.0
+*
+* Unless required by applicable law or agreed to in writing, software
+* distributed under the License is distributed on an "AS IS" BASIS,
+* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+* See the License for the specific language governing permissions and
+* limitations under the License.
+-->
+<!doctype html>
+<html>
+
+<head>
+ <meta charset="utf-8">
+ <meta http-equiv="X-UA-Compatible" content="IE=edge">
+ <meta name="viewport" content="width=device-width, initial-scale=1">
+
+ <title>Apache Kylin | Improving Spark Cubing in Kylin 2.0</title>
+ <meta name="description" content="Apache Kylin is an OLAP engine that
speeds up queries through Cube precomputation. The Cube is a multi-dimensional
dataset that contains all measures precomputed in ...">
+ <meta name="author" content="Apache Kylin">
+ <link rel="shortcut icon" href="fav.png" type="image/png">
+
+
+
+<link rel="stylesheet" href="/assets/css/animate.css">
+<!-- Bootstrap -->
+<link rel="stylesheet" href="/assets/css/bootstrap.min.css">
+
+<!-- Fonts -->
+<!-- <link rel="stylesheet"
href="http://fonts.googleapis.com/css?family=Alice|Open+Sans:400,300,700"> -->
+
+<!-- Icons -->
+<link rel="stylesheet" href="/assets/css/font-awesome.min.css">
+
+ <!-- Custom styles -->
+ <link rel="stylesheet" href="/assets/css/styles.css">
+ <link rel="stylesheet" href="/assets/css/docs.css">
+ <link rel="stylesheet" href="/assets/css/pygments.css">
+
+ <link rel="canonical"
href="http://kylin.apache.org/blog/2017/07/21/Improving-Spark-Cubing/">
+ <link rel="alternate" type="application/rss+xml" title="Apache Kylin"
href="http://kylin.apache.org/feed.xml" />
+
+<!--[if lt IE 9]> <script src="assets/js/html5shiv.js"></script> <![endif]-->
+<script>
+ (function(i,s,o,g,r,a,m){i['GoogleAnalyticsObject']=r;i[r]=i[r]||function(){
+ (i[r].q=i[r].q||[]).push(arguments)},i[r].l=1*new
Date();a=s.createElement(o),
+
m=s.getElementsByTagName(o)[0];a.async=1;a.src=g;m.parentNode.insertBefore(a,m)
+ })(window,document,'script','//www.google-analytics.com/analytics.js','ga');
+
+ //original tracker for kylin.io
+ ga('create', 'UA-55534813-1', 'auto');
+ //new tracker for kylin.apache.org
+ ga('create', 'UA-55534813-2', 'auto', {'name':'toplevel'});
+
+ ga('send', 'pageview');
+ ga('toplevel.send', 'pageview');
+
+
+</script>
+<script type="text/javascript" src="/assets/js/jquery-1.9.1.min.js"></script>
+<script type="text/javascript" src="/assets/js/nside.js"></script>
+<script type="text/javascript" src="/assets/js/nnav.js"></script>
+</head>
+
+ <body>
+
+<header id="header" >
+
+ <div id="head" class="parallax" parallax-speed="3" >
+ <div id="logo" class="text-center"> <img class="img-circle"
id="circlelogo" src="/assets/images/kylin_logo.jpg"> <span class="title"
>Apache Kylin™</span> <span class="tagline">Extreme OLAP Engine for Big
Data</span>
+ </div>
+ <div class="text-center" style="
+ position: relative;
+ top: 66px;
+ width: 1080px;
+ margin: 0 auto;
+ z-index: 11;
+ margin-top: -253px;
+ text-align: right;"
+ >
+ <a href="http://apache.org/foundation/contributing.html" title="Support
Apache" style="margin-left: 150px;">
+ <img src="https://www.apache.org/images/SupportApache-small.png"
style="height: 150px; width: 150px;">
+ </a>
+ </div>
+ </div>
+
+
+ <!-- Main Menu -->
+ <nav class="navbar navbar-default" role="navigation" id="nav-wrapper">
+ <div class="container-fluid" id="nav">
+ <!--
+ <img class="img-circle" width="40px" height="40px" id="circlelogo"
src="/assets/images/kylin_logo.jpg">
+ -->
+ <!-- Brand and toggle get grouped for better mobile display -->
+ <div class="navbar-header">
+ <button type="button" class="navbar-toggle collapsed"
data-toggle="collapse" data-target="#bs-example-navbar-collapse-1">
+ <span class="sr-only">Toggle navigation</span>
+ <span class="icon-bar"></span>
+ <span class="icon-bar"></span>
+ <span class="icon-bar"></span>
+ </button>
+
+ </div>
+
+ <!-- Collect the nav links, forms, and other content for toggling -->
+ <div class="collapse navbar-collapse" id="bs-example-navbar-collapse-1">
+ <ul class="nav navbar-nav">
+ <li><a href="/">Home</a></li>
+ <li><a href="/docs20" >Docs</a></li>
+ <li><a href="/download">Download</a></li>
+ <li><a href="/community" >Community</a></li>
+ <li><a href="/development" >Development</a></li>
+ <li><a href="/blog">Blog</a></li>
+ <li><a href="/cn" >中文版</a></li>
+ <li><a href="https://twitter.com/apachekylin" target="_blank"
class="fa fa-twitter fa-lg" title="Twitter: @ApacheKylin" ></a></li>
+ <li><a href="https://github.com/apache/kylin" target="_blank"
class="fa fa-github-alt fa-lg" title="Github: apache/kylin" ></a></li>
+ <li><a href="https://www.facebook.com/kylinio" target="_blank"
class="fa fa-facebook fa-lg" title="Facebook: kylin.io" ></a></li>
+ </ul>
+ </div><!-- /.navbar-collapse -->
+ </div><!-- /.container-fluid -->
+</nav>
+ </header>
+
+ <div class="page-content">
+ <header style=" padding:2em 0 0 0">
+ <div class="container" >
+ <h4 class="section-title"><span>Apache Kylin™
Technical Blog</span></h4>
+ </div>
+ </div>
+
+ <div class="container">
+ <div>
+ <article class="post-content" >
+
+<div class="post" style=" padding:2em 4em 4em 4em">
+
+ <header class="post-header">
+ <h1 class="post-title">Improving Spark Cubing in Kylin 2.0</h1>
+ <p class="post-meta" >Jul 21, 2017 • Kaisen Kang</p>
+ </header>
+
+ <article class="post-content" >
+ <p>Apache Kylin is an OLAP engine that speeds up queries through Cube
precomputation. A Cube is a multi-dimensional dataset that contains all
measures precomputed for all dimension combinations. Before v2.0, Kylin used
MapReduce to build Cubes. To get better performance, Kylin 2.0 introduced
Spark Cubing. For the principle behind Spark Cubing, please refer to the article
<a
href="http://kylin.apache.org/blog/2017/02/23/by-layer-spark-cubing/">By-layer
Spark Cubing</a>.</p>
+
+<p>In this blog, I will talk about the following topics:</p>
+
+<ul>
+ <li>How to make Spark Cubing support HBase cluster with Kerberos enabled</li>
+ <li>Spark configurations for Cubing</li>
+ <li>Performance of Spark Cubing</li>
+ <li>Pros and cons of Spark Cubing</li>
+ <li>Applicable scenarios of Spark Cubing</li>
+ <li>Improvement for dictionary loading in Spark Cubing</li>
+</ul>
+
+<p>The current version of Spark Cubing (2.0) doesn't support HBase clusters
with Kerberos enabled, because Spark Cubing needs to read metadata from HBase.
There are two ways to solve this problem: make Spark able to connect to HBase
with Kerberos, or avoid connecting to HBase from Spark Cubing at all.</p>
+
+<h3 id="make-spark-connect-hbase-with-kerberos-enabled">Make Spark connect to
HBase with Kerberos enabled</h3>
+<p>If we only want to run Spark Cubing in YARN client mode, we just need to add
three lines of code before new SparkConf() in SparkCubingByLayer:</p>
+
+<div class="highlighter-rouge"><pre class="highlight"><code>Configuration configuration = HBaseConnection.getCurrentHBaseConfiguration();
+HConnection connection = HConnectionManager.createConnection(configuration);
+// Obtain an authentication token for the current user and add it to the user's credentials.
+TokenUtil.obtainAndCacheToken(connection,
+    UserProvider.instantiate(configuration).create(UserGroupInformation.getCurrentUser()));
+</code></pre>
+</div>
+
+<p>As for how to make Spark connect to HBase with Kerberos in YARN cluster mode,
please refer to SPARK-6918, SPARK-12279, and HBASE-17040. That solution may
work, but it is not elegant, so I tried the second solution.</p>
+
+<h3 id="use-hdfs-metastore-for-spark-cubing">Use HDFS metastore for Spark
Cubing</h3>
+
+<p>The core idea here is to upload the metadata that the job needs to HDFS
and to use HDFSResourceStore to manage it.</p>
+
+<p>Before introducing how to use HDFSResourceStore instead of
HBaseResourceStore in Spark Cubing, let's look at the format of Kylin metadata
and how Kylin manages it.</p>
+
+<p>Every piece of metadata for a table, cube, model, or project is a JSON file
in Kylin. All metadata is organized as a directory tree. The picture below
shows the root directory of the Kylin metadata:<br />
+<img
src="http://static.zybuluo.com/kangkaisen/t1tc6neiaebiyfoir4fdhs11/%E5%B1%8F%E5%B9%95%E5%BF%AB%E7%85%A7%202017-07-02%20%E4%B8%8B%E5%8D%883.51.43.png"
alt="Screenshot 2017-07-02 3.51.43 PM.png-20.7kB" /><br />
+The following picture shows the content of the project dir; "learn_kylin"
and "kylin_test" are both project names.<br />
+<img
src="http://static.zybuluo.com/kangkaisen/4dtiioqnw08w6vtj0r9u5f27/%E5%B1%8F%E5%B9%95%E5%BF%AB%E7%85%A7%202017-07-02%20%E4%B8%8B%E5%8D%883.54.59.png"
alt="Screenshot 2017-07-02 3.54.59 PM.png-11.8kB" /></p>
+
+<p>Kylin manages the metadata through ResourceStore, an abstract class that
abstracts the CRUD interface for metadata. ResourceStore has three
implementation classes:</p>
+
+<ul>
+ <li>FileResourceStore (store with Local FileSystem)</li>
+ <li>HDFSResourceStore</li>
+ <li>HBaseResourceStore</li>
+</ul>
+
+<p>Currently, only HBaseResourceStore can be used in production.
FileResourceStore is mainly used for testing. HDFSResourceStore doesn't support
massive concurrent writes, but it is ideal for read-only scenarios like
Cubing. Kylin uses the "kylin.metadata.url" config to decide which kind of
ResourceStore to use.</p>
+
+<p>Now, let's see how to use HDFSResourceStore instead of HBaseResourceStore
in Spark Cubing.</p>
+
+<ol>
+ <li>Determine the metadata that the Spark Cubing job needs.</li>
+ <li>Dump that metadata from HBase to a local directory.</li>
+ <li>Update kylin.metadata.url, then write all Kylin config to a
"kylin.properties" file in the local metadata dir.</li>
+ <li>Use ResourceTool to upload the local metadata to HDFS.</li>
+ <li>Construct the HDFSResourceStore from the HDFS "kylin.properties"
file in the Spark executor.</li>
+</ol>
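+
+<p>As a rough illustration of step 3 and the read-back part of step 5, the
sketch below copies the current config, overrides kylin.metadata.url, writes
everything to a "kylin.properties" file in a local metadata dir, and loads it
again. It uses only java.util.Properties. The class name, the method names, and
the example URL values are hypothetical, the local file system stands in for
HDFS, and the actual patch (KYLIN-2653) may well do this differently.</p>

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Properties;

// Hypothetical helper for illustration only; it is not part of Kylin.
class MetadataDumpSketch {

    // Creates a scratch directory that plays the role of the local metadata dir.
    static Path newLocalMetaDir() {
        try {
            return Files.createTempDirectory("kylin-job-meta");
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Step 3: copy the config, override kylin.metadata.url, write kylin.properties.
    static Path writeKylinProperties(Properties current, String newMetadataUrl, Path localMetaDir) {
        Properties copy = new Properties();
        copy.putAll(current); // the caller's config is left untouched
        copy.setProperty("kylin.metadata.url", newMetadataUrl);
        Path file = localMetaDir.resolve("kylin.properties");
        try (OutputStream out = Files.newOutputStream(file)) {
            copy.store(out, "Kylin config for one Spark Cubing job");
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return file;
    }

    // Step 5 (read-back only): load the config that a store would be built from.
    static Properties readKylinProperties(Path file) {
        Properties props = new Properties();
        try (InputStream in = Files.newInputStream(file)) {
            props.load(in);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return props;
    }

    public static void main(String[] args) {
        Properties current = new Properties();
        current.setProperty("kylin.metadata.url", "kylin_metadata@hbase"); // example value
        // "hdfs:///kylin/job_meta" is a made-up target, not a documented URL format.
        Path file = writeKylinProperties(current, "hdfs:///kylin/job_meta", newLocalMetaDir());
        System.out.println(readKylinProperties(file).getProperty("kylin.metadata.url"));
    }
}
```

+<p>Copying the config before overriding the URL keeps the Kylin server's own
settings pointed at HBase while the job gets its own read-only view.</p>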
+
+<p>Of course, we need to delete the HDFS metadata dir when the job completes. I'm
working on a patch for this; please watch KYLIN-2653 for updates.</p>
+
+<h3 id="spark-configurations-for-cubing">Spark configurations for Cubing</h3>
+
+<p>The following is the Spark configuration I use in our environment. It enables
Spark dynamic resource allocation, so that our users need to set fewer Spark
configurations.</p>
+
+<div class="highlighter-rouge"><pre class="highlight"><code># run in yarn-cluster mode
+kylin.engine.spark-conf.spark.master=yarn
+kylin.engine.spark-conf.spark.submit.deployMode=cluster
+
+# enable Spark dynamic allocation so that users need not set the number of executors explicitly
+kylin.engine.spark-conf.spark.dynamicAllocation.enabled=true
+kylin.engine.spark-conf.spark.dynamicAllocation.minExecutors=10
+kylin.engine.spark-conf.spark.dynamicAllocation.maxExecutors=1024
+kylin.engine.spark-conf.spark.dynamicAllocation.executorIdleTimeout=300
+kylin.engine.spark-conf.spark.shuffle.service.enabled=true
+kylin.engine.spark-conf.spark.shuffle.service.port=7337
+
+# memory settings
+kylin.engine.spark-conf.spark.driver.memory=4G
+# enlarge executor.memory when the cube dict is huge
+kylin.engine.spark-conf.spark.executor.memory=4G
+# because Kylin needs to load the cube dict in the executor
+kylin.engine.spark-conf.spark.executor.cores=1
+
+# enlarge the network timeout
+kylin.engine.spark-conf.spark.network.timeout=600
+
+kylin.engine.spark-conf.spark.yarn.queue=root.hadoop.test
+
+kylin.engine.spark.rdd-partition-cut-mb=100
+</code></pre>
+</div>
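+
+<p>Most of the keys above share the kylin.engine.spark-conf. prefix, and what
follows the prefix is an ordinary Spark property name. The sketch below only
illustrates that naming convention by stripping the prefix from a set of
properties; the class and method names are hypothetical, and this is not
Kylin's own code for submitting the Spark job.</p>

```java
import java.util.Map;
import java.util.Properties;
import java.util.TreeMap;

// Hypothetical helper for illustration only; it is not part of Kylin.
class SparkConfPrefixSketch {

    static final String PREFIX = "kylin.engine.spark-conf.";

    // Returns the Spark-level settings (prefix removed), sorted by key.
    static Map<String, String> sparkConf(Properties kylinProps) {
        Map<String, String> result = new TreeMap<>();
        for (String name : kylinProps.stringPropertyNames()) {
            if (name.startsWith(PREFIX)) {
                result.put(name.substring(PREFIX.length()), kylinProps.getProperty(name));
            }
        }
        return result;
    }

    public static void main(String[] args) {
        Properties props = new Properties();
        props.setProperty("kylin.engine.spark-conf.spark.master", "yarn");
        props.setProperty("kylin.engine.spark-conf.spark.executor.memory", "4G");
        // No spark-conf prefix: this one is a Kylin setting, not a Spark one.
        props.setProperty("kylin.engine.spark.rdd-partition-cut-mb", "100");
        for (Map.Entry<String, String> e : sparkConf(props).entrySet()) {
            System.out.println(e.getKey() + "=" + e.getValue());
        }
    }
}
```

+<p>Note that kylin.engine.spark.rdd-partition-cut-mb does not carry the
prefix, so in this sketch it is treated as a Kylin-side setting rather than a
Spark property.</p>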
+
+<h3 id="performance-test-of-spark-cubing">Performance test of Spark Cubing</h3>
+
+<p>For source data ranging from millions to hundreds of millions of rows, my test
results are consistent with the blog <a
href="http://kylin.apache.org/blog/2017/02/23/by-layer-spark-cubing/">By-layer
Spark Cubing</a>. The improvement is remarkable. In addition, I tested with
billions of rows of source data and, in particular, with a huge dictionary.</p>
+
+<p>Test Cube1 has 2.7 billion rows of source data, 9 dimensions, and one precise
distinct count measure with a cardinality of 70 million (which means the dict also
has a cardinality of 70 million).</p>
+
+<p>Test Cube2 has 2.4 billion rows of source data, 13 dimensions, and 38
measures (including 9 precise distinct count measures).</p>
+
+<p>The test results are shown in the picture below; times are in minutes.<br />
+<img
src="http://static.zybuluo.com/kangkaisen/1urzfkal8od52fodi1l6u0y5/image.png"
alt="image.png-38.1kB" /></p>
+
+<p>In short, <strong>Spark Cubing is much faster than MR Cubing in most
scenarios</strong>.</p>
+
+<h3 id="pros-and-cons-of-spark-cubing">Pros and Cons of Spark Cubing</h3>
+<p>In my opinion, the advantages of Spark Cubing include:</p>
+
+<ol>
+ <li>Because of the RDD cache, Spark Cubing can take full advantage of
memory to avoid disk I/O.</li>
+ <li>When enough memory is available, Spark Cubing can use more of it
to get better build performance.</li>
+</ol>
+
+<p>On the other hand, the drawbacks of Spark Cubing include:</p>
+
+<ol>
+ <li>Spark Cubing can't handle huge dictionaries well (hundreds of
millions of cardinality);</li>
+ <li>Spark Cubing isn't stable enough for very large-scale data.</li>
+</ol>
+
+<h3 id="applicable-scenarios-of-spark-cubing">Applicable scenarios of Spark
Cubing</h3>
+<p>In my opinion, except for the huge dictionary scenario, Spark Cubing can
replace MR Cubing everywhere, especially in the following scenarios:</p>
+
+<ol>
+ <li>Many dimensions</li>
+ <li>Normal dictionaries (e.g., cardinality &lt; 100 million)</li>
+ <li>Normal-scale data (e.g., fewer than 10 billion rows to build at once).</li>
+</ol>
+
+<h3 id="improvement-for-dictionary-loading-in-spark-cubing">Improvement for
dictionary loading in Spark Cubing</h3>
+
+<p>As we all know, a big difference between MR and Spark is that an MR task
runs in its own process, while a Spark task runs in a thread. So in MR
Cubing the dict of a Cube is loaded only once, but in Spark Cubing the dict can be
loaded many times in one executor, which causes frequent GC.</p>
+
+<p>So I made two improvements:</p>
+
+<ol>
+ <li>Load the dict only once per executor.</li>
+ <li>Add maximumSize to the LoadingCache in AppendTrieDictionary so that
dicts are evicted as early as possible.</li>
+</ol>
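+
+<p>The two ideas can be sketched with the JDK alone. Because all tasks of an
executor are threads in one JVM, a static map shared by those threads lets each
dict be loaded once, and a size-bounded LRU map evicts old entries early. This
is a minimal sketch under stated assumptions, not the code contributed to
Kylin: the class and method names are hypothetical, and an access-ordered
LinkedHashMap stands in for a LoadingCache with maximumSize.</p>

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;

// Hypothetical sketch for illustration only; it is not part of Kylin.
class DictCacheSketch {

    // One map per JVM, i.e. per executor; every task thread shares it.
    private static final Map<String, Object> DICTS = new ConcurrentHashMap<>();

    // Counts real loads so the effect is observable.
    static final AtomicInteger LOADS = new AtomicInteger();

    // Improvement 1: the loader runs at most once per dict path in this JVM.
    static Object getDict(String path, Function<String, Object> loader) {
        return DICTS.computeIfAbsent(path, p -> {
            LOADS.incrementAndGet();
            return loader.apply(p);
        });
    }

    // Improvement 2 (stand-in): an LRU map that never holds more than maxEntries.
    static <K, V> Map<K, V> boundedLru(final int maxEntries) {
        return Collections.synchronizedMap(new LinkedHashMap<K, V>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
                return size() > maxEntries; // evict the least recently used entry
            }
        });
    }

    public static void main(String[] args) {
        Object first = getDict("dict-a", p -> new int[8]);
        Object second = getDict("dict-a", p -> new int[8]);
        System.out.println("same instance: " + (first == second) + ", loads: " + LOADS.get());

        Map<String, Integer> slices = boundedLru(2);
        slices.put("s1", 1);
        slices.put("s2", 2);
        slices.get("s1");    // touch s1 so that s2 becomes the eldest entry
        slices.put("s3", 3); // exceeds the bound, so s2 is evicted
        System.out.println("cached slices: " + slices.keySet());
    }
}
```

+<p>A smaller bound trades more reloads for less heap held by the cache, which
is the side of the trade-off that matters when GC pressure is the problem.</p>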
+
+<p>These two improvements have been contributed to the Kylin repository.</p>
+
+<h3 id="summary">Summary</h3>
+<p>Spark Cubing is a great feature of Kylin 2.0; thanks to the Kylin community. We
will apply Spark Cubing to real scenarios in our company. I believe Spark
Cubing will become more robust and efficient in future releases.</p>
+
+
+ </article>
+
+</div>
+
+
+
+
+
+ </article>
+ </div>
+ </div>
+
+<footer id="underfooter">
+ <div class="container">
+ <div class="row">
+ <div class="col-md-12 widget">
+ <div class="widget-body" style="text-align:center">
+ <a href="http://www.apache.org">
+ <img id="asf-logo" alt="Apache Software Foundation"
src="/assets/images/feather-small.gif">
+ </a>
+
+ <div>
+ The contents of this website are © 2015 Apache
Software Foundation under the terms of the <a
+ href="http://www.apache.org/licenses/LICENSE-2.0">
Apache License v2 </a>. Apache Kylin and
+ its logo are trademarks of the Apache Software
Foundation.
+ </div>
+
+ </div>
+ </div>
+ </div>
+ <!-- /row of widgets -->
+
+ </div>
+ <div></div>
+
+</footer>
+
+ <script src="/assets/js/jquery-1.9.1.min.js"></script>
+ <script src="/assets/js/bootstrap.min.js"></script>
+ <script src="/assets/js/main.js"></script>
+ </body>
+</html>
+
+
+
+
Modified: kylin/site/blog/index.html
URL:
http://svn.apache.org/viewvc/kylin/site/blog/index.html?rev=1802650&r1=1802649&r2=1802650&view=diff
==============================================================================
--- kylin/site/blog/index.html (original)
+++ kylin/site/blog/index.html Sat Jul 22 02:07:05 2017
@@ -187,6 +187,12 @@
<li>
<h2 align="left" style="margin:0px">
+ <a class="post-link"
href="/blog/2017/07/21/Improving-Spark-Cubing/">Improving Spark Cubing in Kylin
2.0</a></h2><div align="left" class="post-meta">posted: Jul 21, 2017</div>
+
+ </li>
+
+ <li>
+ <h2 align="left" style="margin:0px">
<a class="post-link" href="/blog/2017/04/01/percentile-measure/">A
new measure for Percentile precalculation</a></h2><div align="left"
class="post-meta">posted: Apr 1, 2017</div>
</li>
@@ -283,25 +289,25 @@
<li>
<h2 align="left" style="margin:0px">
- <a class="post-link"
href="/cn/blog/2016/05/26/release-v1.5.2/">Apache Kylin v1.5.2
正式发布</a></h2><div align="left" class="post-meta">posted: May 26,
2016</div>
+ <a class="post-link" href="/blog/2016/05/26/release-v1.5.2/">Apache
Kylin v1.5.2 Release Announcement</a></h2><div align="left"
class="post-meta">posted: May 26, 2016</div>
</li>
<li>
<h2 align="left" style="margin:0px">
- <a class="post-link" href="/blog/2016/05/26/release-v1.5.2/">Apache
Kylin v1.5.2 Release Announcement</a></h2><div align="left"
class="post-meta">posted: May 26, 2016</div>
+ <a class="post-link"
href="/cn/blog/2016/05/26/release-v1.5.2/">Apache Kylin v1.5.2
正式发布</a></h2><div align="left" class="post-meta">posted: May 26,
2016</div>
</li>
<li>
<h2 align="left" style="margin:0px">
- <a class="post-link" href="/blog/2016/04/12/release-v1.5.1/">Apache
Kylin v1.5.1 Release Announcement</a></h2><div align="left"
class="post-meta">posted: Apr 12, 2016</div>
+ <a class="post-link"
href="/cn/blog/2016/04/12/release-v1.5.1/">Apache Kylin v1.5.1
正式发布</a></h2><div align="left" class="post-meta">posted: Apr 12,
2016</div>
</li>
<li>
<h2 align="left" style="margin:0px">
- <a class="post-link"
href="/cn/blog/2016/04/12/release-v1.5.1/">Apache Kylin v1.5.1
正式发布</a></h2><div align="left" class="post-meta">posted: Apr 12,
2016</div>
+ <a class="post-link" href="/blog/2016/04/12/release-v1.5.1/">Apache
Kylin v1.5.1 Release Announcement</a></h2><div align="left"
class="post-meta">posted: Apr 12, 2016</div>
</li>
@@ -325,13 +331,13 @@
<li>
<h2 align="left" style="margin:0px">
- <a class="post-link" href="/blog/2016/03/16/release-v1.3.0/">Apache
Kylin v1.3.0 Release Announcement</a></h2><div align="left"
class="post-meta">posted: Mar 16, 2016</div>
+ <a class="post-link"
href="/cn/blog/2016/03/16/release-v1.3.0/">Apache Kylin v1.3.0
正式发布</a></h2><div align="left" class="post-meta">posted: Mar 16,
2016</div>
</li>
<li>
<h2 align="left" style="margin:0px">
- <a class="post-link"
href="/cn/blog/2016/03/16/release-v1.3.0/">Apache Kylin v1.3.0
正式发布</a></h2><div align="left" class="post-meta">posted: Mar 16,
2016</div>
+ <a class="post-link" href="/blog/2016/03/16/release-v1.3.0/">Apache
Kylin v1.3.0 Release Announcement</a></h2><div align="left"
class="post-meta">posted: Mar 16, 2016</div>
</li>
@@ -361,13 +367,13 @@
<li>
<h2 align="left" style="margin:0px">
- <a class="post-link" href="/cn/blog/2015/12/23/release-v1.2/">Apache
Kylin v1.2 正式发布</a></h2><div align="left" class="post-meta">posted: Dec
23, 2015</div>
+ <a class="post-link" href="/blog/2015/12/23/release-v1.2/">Apache
Kylin v1.2 Release Announcement</a></h2><div align="left"
class="post-meta">posted: Dec 23, 2015</div>
</li>
<li>
<h2 align="left" style="margin:0px">
- <a class="post-link" href="/blog/2015/12/23/release-v1.2/">Apache
Kylin v1.2 Release Announcement</a></h2><div align="left"
class="post-meta">posted: Dec 23, 2015</div>
+ <a class="post-link" href="/cn/blog/2015/12/23/release-v1.2/">Apache
Kylin v1.2 正式发布</a></h2><div align="left" class="post-meta">posted: Dec
23, 2015</div>
</li>
Modified: kylin/site/feed.xml
URL:
http://svn.apache.org/viewvc/kylin/site/feed.xml?rev=1802650&r1=1802649&r2=1802650&view=diff
==============================================================================
--- kylin/site/feed.xml (original)
+++ kylin/site/feed.xml Sat Jul 22 02:07:05 2017
@@ -19,19 +19,13 @@
<description>Apache Kylin Home</description>
<link>http://kylin.apache.org/</link>
<atom:link href="http://kylin.apache.org/feed.xml" rel="self"
type="application/rss+xml"/>
- <pubDate>Fri, 21 Jul 2017 18:39:35 -0700</pubDate>
- <lastBuildDate>Fri, 21 Jul 2017 18:39:35 -0700</lastBuildDate>
+ <pubDate>Fri, 21 Jul 2017 19:05:44 -0700</pubDate>
+ <lastBuildDate>Fri, 21 Jul 2017 19:05:44 -0700</lastBuildDate>
<generator>Jekyll v2.5.3</generator>
<item>
- <title>Improving Spark Cubing</title>
- <description><h1
id="improving-spark-cubing-in-kylin-20">Improving Spark Cubing in
Kylin 2.0</h1>
-
-<p>Author: Kaisen Kang</p>
-
-<hr />
-
-<p>Apache Kylin is a OALP Engine that speeding up query by Cube
precomputation. The Cube is multi-dimensional dataset which contain precomputed
all measures in all dimension combinations. Before v2.0, Kylin uses MapReduce
to build Cube. In order to get better performance, Kylin 2.0 introduced the
Spark Cubing. About the principle of Spark Cubing, please refer to the article
<a
href="http://kylin.apache.org/blog/2017/02/23/by-layer-spark-cubing/">By-layer
Spark Cubing</a>.</p>
+ <title>Improving Spark Cubing in Kylin 2.0</title>
+ <description><p>Apache Kylin is an OLAP engine that speeds up
queries through Cube precomputation. A Cube is a multi-dimensional dataset that
contains all measures precomputed for all dimension combinations. Before v2.0,
Kylin used MapReduce to build Cubes. To get better performance, Kylin
2.0 introduced Spark Cubing. For the principle behind Spark Cubing, please
refer to the article <a
href="http://kylin.apache.org/blog/2017/02/23/by-layer-spark-cubing/">By-layer
Spark Cubing</a>.</p>
<p>In this blog, I will talk about the following topics:</p>
@@ -177,10 +171,12 @@ kylin.engine.spark.rdd-partition-cut-mb=
<p>Spark Cubing is a great feature of Kylin 2.0; thanks to the Kylin
community. We will apply Spark Cubing to real scenarios in our company. I
believe Spark Cubing will become more robust and efficient in future
releases.</p>
</description>
- <pubDate>Fri, 21 Jul 2017 00:00:00 -0700</pubDate>
- <link>http://kylin.apache.org/2017/07/21/Improving-Spark-Cubing/</link>
- <guid
isPermaLink="true">http://kylin.apache.org/2017/07/21/Improving-Spark-Cubing/</guid>
+ <pubDate>Fri, 21 Jul 2017 15:22:22 -0700</pubDate>
+
<link>http://kylin.apache.org/blog/2017/07/21/Improving-Spark-Cubing/</link>
+ <guid
isPermaLink="true">http://kylin.apache.org/blog/2017/07/21/Improving-Spark-Cubing/</guid>
+
+ <category>blog</category>
</item>