Modified: kylin/site/feed.xml
URL: 
http://svn.apache.org/viewvc/kylin/site/feed.xml?rev=1894464&r1=1894463&r2=1894464&view=diff
==============================================================================
--- kylin/site/feed.xml (original)
+++ kylin/site/feed.xml Fri Oct 22 05:11:34 2021
@@ -19,11 +19,162 @@
     <description>Apache Kylin Home</description>
     <link>http://kylin.apache.org/</link>
    <atom:link href="http://kylin.apache.org/feed.xml" rel="self" type="application/rss+xml"/>
-    <pubDate>Wed, 08 Sep 2021 00:12:30 -0700</pubDate>
-    <lastBuildDate>Wed, 08 Sep 2021 00:12:30 -0700</lastBuildDate>
+    <pubDate>Thu, 21 Oct 2021 22:00:29 -0700</pubDate>
+    <lastBuildDate>Thu, 21 Oct 2021 22:00:29 -0700</lastBuildDate>
     <generator>Jekyll v2.5.3</generator>
     
       <item>
+        <title>Kylin 4 Performance Optimization in the Cloud: Local Cache and Soft Affinity Scheduling</title>
+        <description>&lt;h2 id=&quot;section&quot;&gt;01 Background&lt;/h2&gt;
+&lt;p&gt;Recently, the Apache Kylin community released Kylin 4.0 with a brand-new architecture. Kylin 4.0 separates storage from computing, so users can deploy it in the cloud more flexibly, with elastically scalable compute resources. With cloud infrastructure, users can store cube data on cheap and reliable object storage such as S3. However, under a storage/compute-separated architecture, we need to keep in mind that reading data from remote storage over the network is still a costly operation for compute nodes and often degrades performance.&lt;br /&gt;
+To improve the query performance of Kylin 4.0 when cloud object storage is used as the storage layer, we tried introducing a local cache mechanism into the Kylin 4.0 query engine: during query execution, frequently used data is cached on the local disk, reducing the latency of pulling data from remote object storage and delivering faster query responses. In addition, to avoid wasting disk space by caching the same data on a large number of Spark executors at the same time, and to let compute nodes read more of the required data from the local cache, we introduced a soft affinity scheduling strategy. Soft affinity establishes a mapping between Spark executors and data files so that, in most cases, the same data is always read on the same executor, which improves the cache hit rate.&lt;/p&gt;
+
+&lt;h2 id=&quot;section-1&quot;&gt;02 Implementation Principle&lt;/h2&gt;
+
+&lt;h4 id=&quot;section-2&quot;&gt;1. Local Cache&lt;/h4&gt;
+&lt;p&gt;When Kylin 4.0 executes a query, it goes through the following stages; the stages where the local cache can be used to improve performance are marked with dotted lines:&lt;/p&gt;
+
+&lt;p&gt;&lt;img src=&quot;/images/blog/local-cache/Local_cache_stage.png&quot; alt=&quot;&quot; /&gt;&lt;/p&gt;
+
+&lt;ul&gt;
+  &lt;li&gt;File list cache: caches file status on the Spark driver side. When executing a query, the Spark driver needs to read the file list and obtain file information for subsequent scheduling, so the file status information is cached locally to avoid frequent reads of the remote file directory.&lt;/li&gt;
+  &lt;li&gt;Data cache: caches data on the Spark executor side. Users can cache data in memory or on disk. When caching in memory, increase the executor memory appropriately so that executors have enough memory for the data cache; when caching on disk, set the data cache directory, preferably on an SSD. In addition, the maximum cache capacity, the number of replications, and other options can all be adjusted by the user.&lt;/li&gt;
+&lt;/ul&gt;
+
+&lt;p&gt;Based on this design, different types of caches are used on the driver side and the executor side of sparder, the query engine of Kylin 4.0. The basic architecture is as follows:&lt;/p&gt;
+
+&lt;p&gt;&lt;img src=&quot;/images/blog/local-cache/kylin4_local_cache.png&quot; alt=&quot;&quot; /&gt;&lt;/p&gt;
+
+&lt;h4 id=&quot;section-3&quot;&gt;2. Soft Affinity Scheduling&lt;/h4&gt;
+&lt;p&gt;When caching data on the executor side, if all the data were cached on every executor, the total size of the cached data would be considerable, greatly wasting disk space and easily causing frequent cache eviction. To maximize the executor cache hit rate, the Spark driver needs to schedule tasks for the same file onto the same executor whenever resources allow, so that a file’s data is cached on one or a few specific executors and subsequent reads can be served from the cache.&lt;br /&gt;
+To this end, we compute a hash of the file name and take it modulo the number of executors to derive the target executor list. How many executors cache a file is determined by the user-configured number of cache replications; generally, the larger the number of replications, the higher the probability of hitting the cache. When none of the target executors is reachable or has resources for scheduling, the scheduler falls back to Spark’s random scheduling mechanism. This approach is called soft affinity scheduling: although it cannot guarantee a 100% cache hit rate, it effectively improves the hit rate and, at minimal performance cost, avoids the large amount of disk space a full cache would waste.&lt;/p&gt;
+
+&lt;h2 id=&quot;section-4&quot;&gt;03 Related Configuration&lt;/h2&gt;
+&lt;p&gt;Based on the above principles, we implemented the basic local cache + soft affinity scheduling functionality in Kylin 4.0 and ran query performance tests on the SSB and TPC-H data sets.&lt;br /&gt;
+Several important configuration items are listed here; the actual configuration used is given in the link at the end:&lt;br /&gt;
+- Whether to enable soft affinity scheduling: kylin.query.spark-conf.spark.kylin.soft-affinity.enabled&lt;br /&gt;
+- Whether to enable the local cache: kylin.query.spark-conf.spark.hadoop.spark.kylin.local-cache.enabled&lt;br /&gt;
+- The number of data cache replications, i.e. how many executors cache the same data file: kylin.query.spark-conf.spark.kylin.soft-affinity.replications.num&lt;br /&gt;
+- Whether to cache in memory or in a local directory; set BUFF to cache in memory and LOCAL to cache on local disk: kylin.query.spark-conf.spark.hadoop.alluxio.user.client.cache.store.type&lt;br /&gt;
+- Maximum cache capacity: kylin.query.spark-conf.spark.hadoop.alluxio.user.client.cache.size&lt;/p&gt;
+
+&lt;h2 id=&quot;section-5&quot;&gt;04 Performance Comparison&lt;/h2&gt;
+&lt;p&gt;We ran performance tests for three scenarios in an AWS EMR environment with scale factor = 10: a single-concurrency query test on the SSB data set, and single-concurrency and 4-concurrency query tests on the TPC-H data set. Both the experimental group and the control group used S3 as storage; local cache and soft affinity scheduling were enabled in the experimental group and disabled in the control group. In addition, we compared the experimental group’s results with those obtained in the same environment using HDFS as storage, so that users can see intuitively how much local cache + soft affinity scheduling improves Kylin 4.0 when it is deployed in the cloud with object storage.&lt;/p&gt;
+
+&lt;p&gt;&lt;img src=&quot;/images/blog/local-cache/local_cache_benchmark_result_ssb.png&quot; alt=&quot;&quot; /&gt;&lt;/p&gt;
+
+&lt;p&gt;&lt;img src=&quot;/images/blog/local-cache/local_cache_benchmark_result_tpch1.png&quot; alt=&quot;&quot; /&gt;&lt;/p&gt;
+
+&lt;p&gt;&lt;img src=&quot;/images/blog/local-cache/local_cache_benchmark_result_tpch4.png&quot; alt=&quot;&quot; /&gt;&lt;/p&gt;
+
+&lt;p&gt;The results above show:&lt;br /&gt;
+1. In the single-concurrency scenario on the SSB 10 data set with S3 as storage, enabling local cache and soft affinity scheduling yields roughly a 3x performance improvement, matching the performance of HDFS as storage and even exceeding it by about 5%.&lt;br /&gt;
+2. On the TPC-H 10 data set with S3 as storage, whether with single-concurrency or multi-concurrency queries, enabling local cache and soft affinity scheduling brings a large performance improvement on almost all queries.&lt;/p&gt;
+
+&lt;p&gt;However, in the Q21 comparison under the 4-concurrency TPC-H 10 test, we observed that enabling local cache and soft affinity scheduling actually performed worse than using S3 alone as storage. The data was probably not read through the cache for some reason; we did not analyze the root cause further in this test and will improve this gradually in subsequent optimization work. Since TPC-H queries are complex and the SQL types vary, some queries still perform slightly worse than with HDFS as storage, but overall the results are close to those of HDFS.&lt;br /&gt;
+This performance test is a preliminary validation of the improvement brought by local cache + soft affinity scheduling. Overall, the combination achieves a clear performance improvement for both simple and complex queries, though with some performance loss in high-concurrency scenarios.&lt;br /&gt;
+If users adopt cloud object storage for Kylin 4.0 and enable local cache + soft affinity scheduling, they can get a very good performance experience, which provides a performance guarantee for running Kylin 4.0 in the cloud with a storage/compute-separated architecture.&lt;/p&gt;
+
+&lt;h2 id=&quot;section-6&quot;&gt;05 Code Implementation&lt;/h2&gt;
+&lt;p&gt;The current implementation is still at an early stage, and many details remain to be refined, such as implementing consistent hashing and deciding how to handle the existing cache when the number of executors changes, so the author has not yet submitted a PR to the community repository. Developers who want a preview can view the source code via the link below:&lt;br /&gt;
+&lt;a href=&quot;https://github.com/zzcclp/kylin/commit/4e75b7fa4059dd2eaed24061fda7797fecaf2e35&quot;&gt;Implementation of local cache + soft affinity scheduling for Kylin 4.0&lt;/a&gt;&lt;/p&gt;
+
+&lt;h2 id=&quot;section-7&quot;&gt;06 Related Links&lt;/h2&gt;
+&lt;p&gt;The performance test result data and the specific configuration can be viewed via the link:&lt;br /&gt;
+&lt;a href=&quot;https://github.com/Kyligence/kylin-tpch/issues/9&quot;&gt;Benchmark of Kylin 4.0 with local cache and soft affinity scheduling&lt;/a&gt;&lt;/p&gt;
+</description>
+        <pubDate>Thu, 21 Oct 2021 04:00:00 -0700</pubDate>
+        <link>http://kylin.apache.org/cn_blog/2021/10/21/Local-Cache-and-Soft-Affinity-Scheduling/</link>
+        <guid isPermaLink="true">http://kylin.apache.org/cn_blog/2021/10/21/Local-Cache-and-Soft-Affinity-Scheduling/</guid>
+        
+        
+        <category>cn_blog</category>
+        
+      </item>
+    
+      <item>
+        <title>Performance Optimization of Kylin 4.0 in the Cloud: Local Cache and Soft Affinity Scheduling</title>
+        <description>&lt;h2 id=&quot;background-introduction&quot;&gt;01 Background Introduction&lt;/h2&gt;
+&lt;p&gt;Recently, the Apache Kylin community released Kylin 4.0.0 with a new architecture. Kylin 4.0 separates storage from computing, which enables users to run it in a more flexible cloud deployment mode with elastically scalable computing resources. With cloud infrastructure, users can choose cheap and reliable object storage, such as S3, to store cube data. However, under a storage/compute-separated architecture, we need to keep in mind that reading data from remote storage over the network is still a costly operation for compute nodes and often leads to performance loss.&lt;br /&gt;
+To improve the query performance of Kylin 4.0 when cloud object storage is used as the storage layer, we tried introducing a local cache mechanism into the Kylin 4.0 query engine: during query execution, frequently used data is cached on the local disk to reduce the latency of pulling data from the remote object storage and achieve faster query responses. In addition, to avoid wasting disk space when the same data is cached on a large number of Spark executors at the same time, and to let compute nodes read more of the required data from the local cache, we introduced a soft affinity scheduling strategy. Soft affinity establishes a mapping between Spark executors and data files so that, in most cases, the same data is always read on the same executor, which improves the cache hit rate.&lt;/p&gt;
+
+&lt;h2 id=&quot;implementation-principle&quot;&gt;02 Implementation Principle&lt;/h2&gt;
+
+&lt;h4 id=&quot;local-cache&quot;&gt;1. Local Cache&lt;/h4&gt;
+
+&lt;p&gt;When Kylin 4.0 executes a query, it mainly goes through the following stages; the stages where the local cache can be used to improve performance are marked with dotted lines:&lt;/p&gt;
+
+&lt;p&gt;&lt;img src=&quot;/images/blog/local-cache/Local_cache_stage.png&quot; alt=&quot;&quot; /&gt;&lt;/p&gt;
+
+&lt;ul&gt;
+  &lt;li&gt;File list cache: caches the file status on the Spark driver side. When executing a query, the Spark driver needs to read the file list and obtain file information for subsequent scheduling, so the file status information is cached locally to avoid frequent reads of remote file directories.&lt;/li&gt;
+  &lt;li&gt;Data cache: caches the data on the Spark executor side. You can set the data cache to memory or disk. If it is set to cache in memory, increase the executor memory appropriately to ensure that the executor has enough memory for the data cache; if it is cached on disk, set the data cache directory, preferably on an SSD.&lt;/li&gt;
+&lt;/ul&gt;
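As a rough illustration of the read-through behavior described above (a minimal sketch with hypothetical names and layout, not Kylin's actual classes):

```python
import os


class LocalReadThroughCache:
    """Minimal read-through file cache: serve reads from a local directory
    when possible, otherwise fetch from 'remote' storage first (sketch)."""

    def __init__(self, cache_dir):
        self.cache_dir = cache_dir
        self.hits = 0
        self.misses = 0

    def _local_path(self, remote_path):
        # Flatten the remote path into a file name inside the cache dir.
        return os.path.join(self.cache_dir, remote_path.replace("/", "_"))

    def read(self, remote_path, fetch):
        """`fetch(remote_path) -> bytes` is called only on a cache miss."""
        local = self._local_path(remote_path)
        if os.path.exists(local):
            self.hits += 1
        else:
            self.misses += 1
            data = fetch(remote_path)
            with open(local, "wb") as f:  # populate the local cache
                f.write(data)
        with open(local, "rb") as f:
            return f.read()
```

On the second read of the same file the remote `fetch` is skipped entirely, which is the latency saving the local cache is after.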
+
+&lt;p&gt;Based on this design, different types of caches are used on the driver side and the executor side of the query engine of Kylin 4.0. The basic architecture is as follows:&lt;/p&gt;
+
+&lt;p&gt;&lt;img src=&quot;/images/blog/local-cache/kylin4_local_cache.png&quot; alt=&quot;&quot; /&gt;&lt;/p&gt;
+
+&lt;h4 id=&quot;soft-affinity-scheduling&quot;&gt;2. Soft Affinity Scheduling&lt;/h4&gt;
+
+&lt;p&gt;When caching data on the executor side, if all data were cached on every executor, the total size of the cached data would be considerable, greatly wasting disk space and easily causing frequent cache eviction. To maximize the cache hit rate of the Spark executors, the Spark driver needs to schedule the tasks of the same file onto the same executor as far as possible when resource conditions are met, so that the data of one file is cached on a specific executor or a few executors, and the data can be read through the cache when it is read again.&lt;br /&gt;
+To this end, we compute the target executor list by hashing the file name and taking the result modulo the number of executors. The number of executors that cache a file is determined by the number of data cache replications configured by the user; generally, the larger the number of cache replications, the higher the probability of hitting the cache. When the target executors are unreachable or have no resources for scheduling, the scheduler falls back to Spark’s random scheduling mechanism. This scheduling method is called soft affinity scheduling: although it cannot guarantee a 100% cache hit rate, it effectively improves the hit rate and, while minimizing performance loss, avoids the large amount of disk space wasted by a full cache.&lt;/p&gt;
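The hash-modulo mapping with random fallback described above can be sketched as follows (an illustrative sketch with hypothetical function names; the actual Kylin/Spark code differs):

```python
import hashlib
import random


def candidate_executors(file_name, executor_ids, replications=2):
    """Pick the executors that should cache `file_name` (soft affinity)."""
    if not executor_ids:
        return []
    # Hash the file name, take it modulo the executor count to get a stable
    # starting index, then choose `replications` consecutive executors.
    h = int(hashlib.md5(file_name.encode("utf-8")).hexdigest(), 16)
    n = len(executor_ids)
    start = h % n
    return [executor_ids[(start + i) % n] for i in range(min(replications, n))]


def schedule_task(file_name, executor_ids, free_slots, replications=2):
    """Prefer a cache-affine executor; fall back to random scheduling
    (hence "soft" affinity) when no preferred executor has free slots."""
    for ex in candidate_executors(file_name, executor_ids, replications):
        if free_slots.get(ex, 0) > 0:
            return ex
    return random.choice(executor_ids)
```

Because the mapping depends only on the file name and the executor list, repeated reads of the same file land on the same executors, which is what drives the cache hit rate up.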
+
+&lt;h2 id=&quot;related-configuration&quot;&gt;03 Related Configuration&lt;/h2&gt;
+
+&lt;p&gt;Based on the above principles, we implemented the basic local cache + soft affinity scheduling functionality in Kylin 4.0 and tested query performance on the SSB and TPC-H data sets.&lt;br /&gt;
+Several important configuration items are listed here for reference; the actual configuration used is given in the link at the end:&lt;/p&gt;
+
+&lt;ul&gt;
+  &lt;li&gt;Enable soft affinity scheduling: kylin.query.spark-conf.spark.kylin.soft-affinity.enabled&lt;/li&gt;
+  &lt;li&gt;Enable the local cache: kylin.query.spark-conf.spark.hadoop.spark.kylin.local-cache.enabled&lt;/li&gt;
+  &lt;li&gt;The number of data cache replications, that is, how many executors cache the same data file: kylin.query.spark-conf.spark.kylin.soft-affinity.replications.num&lt;/li&gt;
+  &lt;li&gt;Whether to cache in memory or in a local directory; set BUFF to cache in memory and LOCAL to cache on local disk: kylin.query.spark-conf.spark.hadoop.alluxio.user.client.cache.store.type&lt;/li&gt;
+  &lt;li&gt;Maximum cache capacity: kylin.query.spark-conf.spark.hadoop.alluxio.user.client.cache.size&lt;/li&gt;
+&lt;/ul&gt;
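Put together, a kylin.properties fragment enabling the feature might look like the following (the values are illustrative examples only; the exact settings used for the benchmark are given in the link at the end):

```properties
# Illustrative example values -- not the benchmark configuration.
kylin.query.spark-conf.spark.kylin.soft-affinity.enabled=true
kylin.query.spark-conf.spark.hadoop.spark.kylin.local-cache.enabled=true
kylin.query.spark-conf.spark.kylin.soft-affinity.replications.num=2
kylin.query.spark-conf.spark.hadoop.alluxio.user.client.cache.store.type=LOCAL
kylin.query.spark-conf.spark.hadoop.alluxio.user.client.cache.size=10GB
```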
+
+&lt;h2 id=&quot;performance-benchmark&quot;&gt;04 Performance Benchmark&lt;/h2&gt;
+
+&lt;p&gt;We conducted performance tests for three scenarios in an AWS EMR environment with scale factor = 10: a single-concurrency query test on the SSB data set, and single-concurrency and 4-concurrency query tests on the TPC-H data set. S3 was configured as storage for both the experimental group and the control group; local cache and soft affinity scheduling were enabled in the experimental group but not in the control group. In addition, we compared the experimental group’s results with those obtained with HDFS as storage in the same environment, so that users can see intuitively how much local cache + soft affinity scheduling improves Kylin 4.0 when it is deployed in the cloud with object storage.&lt;/p&gt;
+
+&lt;p&gt;&lt;img src=&quot;/images/blog/local-cache/local_cache_benchmark_result_ssb.png&quot; alt=&quot;&quot; /&gt;&lt;/p&gt;
+
+&lt;p&gt;&lt;img src=&quot;/images/blog/local-cache/local_cache_benchmark_result_tpch1.png&quot; alt=&quot;&quot; /&gt;&lt;/p&gt;
+
+&lt;p&gt;&lt;img src=&quot;/images/blog/local-cache/local_cache_benchmark_result_tpch4.png&quot; alt=&quot;&quot; /&gt;&lt;/p&gt;
+
+&lt;p&gt;As can be seen from the above results:&lt;/p&gt;
+
+&lt;ol&gt;
+  &lt;li&gt;In the single-concurrency scenario on the SSB data set with S3 as storage, turning on local cache and soft affinity scheduling yields roughly a 3x performance improvement, matching the performance of HDFS as storage and even exceeding it by about 5%.&lt;/li&gt;
+  &lt;li&gt;On the TPC-H data set with S3 as storage, whether for single-concurrency or multi-concurrency queries, enabling local cache and soft affinity scheduling greatly improves the performance of almost all queries.&lt;/li&gt;
+&lt;/ol&gt;
+
+&lt;p&gt;However, in the Q21 comparison under the 4-concurrency TPC-H test, we observed that enabling local cache and soft affinity scheduling actually performed worse than using S3 alone as storage. The data was probably not read through the cache for some reason; the underlying cause was not analyzed further in this test, and we will gradually improve this in subsequent optimization work. Moreover, because TPC-H queries are complex and the SQL types vary, compared with the HDFS results some queries improve while others fall slightly short, but generally speaking the results are very close to those with HDFS as storage.&lt;br /&gt;
+This performance test is a preliminary verification of the improvement brought by local cache + soft affinity scheduling. On the whole, local cache + soft affinity scheduling achieves a significant performance improvement for both simple and complex queries, though there is a certain performance loss in high-concurrency query scenarios.&lt;br /&gt;
+If users adopt cloud object storage as Kylin 4.0 storage, they can get a good performance experience with local cache + soft affinity scheduling enabled, which provides a performance guarantee for Kylin 4.0’s storage/compute-separated architecture in the cloud.&lt;/p&gt;
+
+&lt;h2 id=&quot;code-implementation&quot;&gt;05 Code Implementation&lt;/h2&gt;
+
+&lt;p&gt;Since the current implementation is still at an early stage, many details remain to be refined, such as implementing consistent hashing and deciding how to handle the existing cache when the number of executors changes, so the author has not yet submitted a PR to the community repository. Developers who want a preview can view the source code through the following link:&lt;/p&gt;
+
+&lt;p&gt;&lt;a href=&quot;https://github.com/zzcclp/kylin/commit/4e75b7fa4059dd2eaed24061fda7797fecaf2e35&quot;&gt;The code implementation of local cache and soft affinity scheduling&lt;/a&gt;&lt;/p&gt;
+
+&lt;h2 id=&quot;related-link&quot;&gt;06 Related Links&lt;/h2&gt;
+
+&lt;p&gt;The performance test result data and the specific configuration can be viewed through the link:&lt;br /&gt;
+&lt;a href=&quot;https://github.com/Kyligence/kylin-tpch/issues/9&quot;&gt;The benchmark of Kylin 4.0 with local cache and soft affinity scheduling&lt;/a&gt;&lt;/p&gt;
+</description>
+        <pubDate>Thu, 21 Oct 2021 04:00:00 -0700</pubDate>
+        <link>http://kylin.apache.org/blog/2021/10/21/Local-Cache-and-Soft-Affinity-Scheduling/</link>
+        <guid isPermaLink="true">http://kylin.apache.org/blog/2021/10/21/Local-Cache-and-Soft-Affinity-Scheduling/</guid>
+        
+        
+        <category>blog</category>
+        
+      </item>
+    
+      <item>
        <title>Kylin's Practice and Optimization in Meituan's In-store Dining Business</title>
        <description>&lt;p&gt;Since 2016, the Meituan in-store dining technology team has been using Apache Kylin as its OLAP engine; however, with rapid business growth, efficiency problems appeared in both cube building and querying. The team therefore started from studying the underlying principles, decomposed each process layer by layer, and laid out an implementation roadmap that moved from individual points to the whole picture. This article summarizes that experience, hoping to help more technology teams in the industry improve their data delivery efficiency.&lt;/p&gt;
 
@@ -577,155 +728,6 @@ For example, a query joins two subquerie
       </item>
     
       <item>
-        <title>Why did Youzan choose Kylin4</title>
-        <description>&lt;p&gt;At the QCon Global Software Developers 
Conference held on May 29, 2021, Zheng Shengjun, head of Youzan’s data 
infrastructure platform, shared Youzan’s internal use experience and 
optimization practice of Kylin 4.0 on the meeting room of open source big data 
frameworks and applications. &lt;br /&gt;
-For many users of Kylin2/3(Kylin on HBase), this is also a chance to learn how 
and why to upgrade to Kylin 4.&lt;/p&gt;
-
-&lt;p&gt;This sharing is mainly divided into the following parts:&lt;/p&gt;
-
-&lt;ul&gt;
-  &lt;li&gt;The reason for choosing Kylin 4&lt;/li&gt;
-  &lt;li&gt;Introduction to Kylin 4&lt;/li&gt;
-  &lt;li&gt;How to optimize performance of Kylin 4&lt;/li&gt;
-  &lt;li&gt;Practice of Kylin 4 in Youzan&lt;/li&gt;
-&lt;/ul&gt;
-
-&lt;h2 id=&quot;the-reason-for-choosing-kylin-4&quot;&gt;01 The reason for 
choosing Kylin 4&lt;/h2&gt;
-
-&lt;h3 id=&quot;introduction-to-youzan&quot;&gt;Introduction to 
Youzan&lt;/h3&gt;
-&lt;p&gt;China Youzan Co., Ltd (stock code 08083.HK). is an enterprise mainly 
engaged in retail technology services.&lt;br /&gt;
-At present, it owns several tools and solutions to provide SaaS software 
products and talent services to help merchants operate mobile social e-commerce 
and new retail channels in an all-round way. &lt;br /&gt;
-Currently Youzan has hundreds of millions of consumers and 6 million existing 
merchants.&lt;/p&gt;
-
-&lt;h3 id=&quot;history-of-kylin-in-youzan&quot;&gt;History of Kylin in 
Youzan&lt;/h3&gt;
-&lt;p&gt;&lt;img src=&quot;/images/blog/youzan/1 
history_of_youzan_OLAP.png&quot; alt=&quot;&quot; /&gt;&lt;/p&gt;
-
-&lt;p&gt;First of all, I would like to share why Youzan chose to upgrade to 
Kylin 4. Here, let me briefly reviewed the history of Youzan OLAP 
infra.&lt;/p&gt;
-
-&lt;p&gt;In the early days of Youzan, in order to iterate develop process 
quickly, we chose the method of pre-computation + MySQL; in 2018, Druid was 
introduced because of query flexibility and development efficiency, but there 
were problems such as low pre-aggregation, not supporting precisely count 
distinct measure. In this situation, Youzan introduced Apache Kylin and 
ClickHouse. Kylin supports high aggregation, precisely count distinct measure 
and the lowest RT, while ClickHouse is quite flexible in usage(ad hoc 
query).&lt;/p&gt;
-
-&lt;p&gt;From the introduction of Kylin in 2018 to now, Youzan has used Kylin 
for more than three years. With the continuous enrichment of business scenarios 
and the continuous accumulation of data volume, Youzan currently has 6 million 
existing merchants, GMV in 2020 is 107.3 billion, and the daily build data 
volume is 10 billion +. At present, Kylin has basically covered all the 
business scenarios of Youzan.&lt;/p&gt;
-
-&lt;h3 id=&quot;the-challenges-of-kylin-3&quot;&gt;The challenges of Kylin 
3&lt;/h3&gt;
-&lt;p&gt;With Youzan’s rapid development and in-depth use of Kylin, we also 
encountered some challenges:&lt;/p&gt;
-
-&lt;ul&gt;
-  &lt;li&gt;First of all, the build performance of Kylin on HBase cannot meet 
the favorable expectations, and the build performance will affect the user’s 
failure recovery time and stability experience;&lt;/li&gt;
-  &lt;li&gt;Secondly, with the access of more large merchants (tens of 
millions of members in a single store, with hundreds of thousands of goods for 
each store), it also brings great challenges to our OLAP system. Kylin on HBase 
is limited by the single-point query of Query Server, and cannot support these 
complex scenarios well;&lt;/li&gt;
-  &lt;li&gt;Finally, because HBase is not a cloud-native system, it is 
difficult to achieve flexible scale up and scale down. With the continuous 
growth of data volume, this system has peaks and valleys for businesses, which 
results in the average resource utilization rate is not high enough.&lt;/li&gt;
-&lt;/ul&gt;
-
-&lt;p&gt;Faced with these challenges, Youzan chose to move closer and upgrade 
to the more cloud-native Apache Kylin 4.&lt;/p&gt;
-
-&lt;h2 id=&quot;introduction-to-kylin-4&quot;&gt;02 Introduction to Kylin 
4&lt;/h2&gt;
-&lt;p&gt;First of all, let’s introduce the main advantages of Kylin 4. 
Apache Kylin 4 completely depends on Spark for cubing job and query. It can 
make full use of Spark’s parallelization, vectorization, and global 
dynamic code generation technologies to improve the efficiency of large 
queries.&lt;br /&gt;
-Here is a brief introduction to the principle of Kylin 4, that is storage 
engine, build engine and query engine.&lt;/p&gt;
-
-&lt;h3 id=&quot;storage-engine&quot;&gt;Storage engine&lt;/h3&gt;
-&lt;p&gt;&lt;img src=&quot;/images/blog/youzan/2 kylin4_storage.png&quot; 
alt=&quot;&quot; /&gt;&lt;/p&gt;
-
-&lt;p&gt;First of all, let’s take a look at the new storage engine, 
comparison between Kylin on HBase and Kylin on Parquet. The cuboid data of 
Kylin on HBase is stored in the table of HBase. Single Segment corresponds to 
one HBase table. Aggregation is pushed down to HBase coprocessor.&lt;/p&gt;
-
-&lt;p&gt;But as we know,  HBase is not a real Columnar Storage and its 
throughput is not enough for OLAP System. Kylin 4 replaces HBase with Parquet, 
all the data is stored in files. Each segment will have a corresponding HDFS 
directory. All queries and cubing jobs read and write files without HBase . 
Although there will be a certain loss of performance for simple queries, the 
improvement brought about by complex queries is more considerable and 
worthwhile.&lt;/p&gt;
-
-&lt;h3 id=&quot;build-engine&quot;&gt;Build engine&lt;/h3&gt;
-&lt;p&gt;&lt;img src=&quot;/images/blog/youzan/3 kylin4_build_engine.png&quot; 
alt=&quot;&quot; /&gt;&lt;/p&gt;
-
-&lt;p&gt;The second is the new build engine. Based on our test, the build 
speed of Kylin on Parquet has been optimized from 82 minutes to 15 minutes. 
There are several reasons:&lt;/p&gt;
-
-&lt;ul&gt;
-  &lt;li&gt;Kylin 4 removes the encoding of the dimension, eliminating a 
building step of encoding;&lt;/li&gt;
-  &lt;li&gt;Removed the HBase File generation step;&lt;/li&gt;
-  &lt;li&gt;Kylin on Parquet changes the granularity of cubing to cuboid 
level, which is conducive to further improving parallelism of cubing 
job.&lt;/li&gt;
-  &lt;li&gt;Enhanced implementation for global dictionary. In the new 
algorithm, dictionary and source data are hashed into the same buckets, making 
it possible for loading only piece of dictionary bucket to encode source 
data.&lt;/li&gt;
-&lt;/ul&gt;
-
-&lt;p&gt;As you can see on the right, after upgradation to Kylin 4, cubing job 
changes from ten steps to two steps, the performance improvement of the 
construction is very obvious.&lt;/p&gt;
-
-&lt;h3 id=&quot;query-engine&quot;&gt;Query engine&lt;/h3&gt;
-&lt;p&gt;&lt;img src=&quot;/images/blog/youzan/4 kylin4_query.png&quot; 
alt=&quot;&quot; /&gt;&lt;/p&gt;
-
-&lt;p&gt;Next is the new query engine of Kylin 4. As you can see, the 
calculation of Kylin on HBase is completely dependent on the coprocessor of 
HBase and query server process. When the data is read from HBase into query 
server to do aggregation, sorting, etc, the bottleneck will be restricted by 
the single point of query server. But Kylin 4 is converted to a fully 
distributed query mechanism based on Spark, what’s more, it ‘s able to do 
configuration tuning automatically in spark query step !&lt;/p&gt;
-
-&lt;h2 id=&quot;how-to-optimize-performance-of-kylin-4&quot;&gt;03 How to 
optimize performance of Kylin 4&lt;/h2&gt;
-&lt;p&gt;Next, I’d like to share some performance optimizations made by 
Youzan in Kylin 4.&lt;/p&gt;
-
-&lt;h3 id=&quot;optimization-of-query-engine&quot;&gt;Optimization of query 
engine&lt;/h3&gt;
-&lt;p&gt;#### 1.Cache Calcite physical plan&lt;br /&gt;
-&lt;img src=&quot;/images/blog/youzan/5 cache_calcite_plan.png&quot; 
alt=&quot;&quot; /&gt;&lt;/p&gt;
-
-&lt;p&gt;In Kylin4, SQL will be analyzed, optimized and do code generation in 
calcite. This step takes up about 150ms for some queries. We have supported 
PreparedStatementCache in Kylin4 to cache calcite plan, so that the structured 
SQL don’t have to do the same step again. With this optimization it saved 
about 150ms of time cost.&lt;/p&gt;
-
-&lt;h4 id=&quot;tunning-spark-configuration&quot;&gt;2.Tunning spark 
configuration&lt;/h4&gt;
-&lt;p&gt;&lt;img src=&quot;/images/blog/youzan/6 
tuning_spark_configuration.png&quot; alt=&quot;&quot; /&gt;&lt;/p&gt;
-
-&lt;p&gt;Kylin4 uses spark as query engine. As spark is a distributed engine 
designed for massive data processing, it’s inevitable to loose some 
performance for small queries. We have tried to do some tuning to catch up with 
the latency in Kylin on HBase for small queries.&lt;/p&gt;
-
-&lt;p&gt;Our first optimization is to make more calculations finish in memory. 
The key is to avoid data spill during aggregation, shuffle and sort. Tuning the 
following configuration is helpful.&lt;/p&gt;
-
-&lt;ul&gt;
-  &lt;li&gt;1.set &lt;code 
class=&quot;highlighter-rouge&quot;&gt;spark.sql.objectHashAggregate.sortBased.fallbackThreshold&lt;/code&gt;
 to larger value to avoid HashAggregate fall back to Sort Based Aggregate, 
which really kills performance when happens.&lt;/li&gt;
-  &lt;li&gt;2.set &lt;code 
class=&quot;highlighter-rouge&quot;&gt;spark.shuffle.spill.initialMemoryThreshold&lt;/code&gt;
 to a large value to avoid to many spills during shuffle.&lt;/li&gt;
-&lt;/ul&gt;
-
-&lt;p&gt;Secondly, we route small queries to Query Server which run spark in 
local mode. Because the overhead of task schedule, shuffle read and variable 
broadcast is enlarged for small queries on YARN/Standalone mode.&lt;/p&gt;
-
-&lt;p&gt;Thirdly, we use RAM disk to enhance shuffle performance. Mount RAM 
disk as TMPFS and set spark.local.dir to directory using RAM disk.&lt;/p&gt;
-
-&lt;p&gt;Lastly, we disabled spark’s whole stage code generation for small 
queries, for spark’s whole stage code generation will cost about 100ms~200ms, 
whereas it’s not beneficial to small queries which is a simple 
project.&lt;/p&gt;
-
-&lt;h4 id=&quot;parquet-optimization&quot;&gt;3.Parquet optimization&lt;/h4&gt;
-&lt;p&gt;&lt;img src=&quot;/images/blog/youzan/7 
parquet_optimization.png&quot; alt=&quot;&quot; /&gt;&lt;/p&gt;
-
-&lt;p&gt;Optimizing parquet is also important for queries.&lt;/p&gt;
-
-&lt;p&gt;The first principal is that we’d better always include shard by 
column in our filter condition, for parquet files are shard by shard-by-column, 
filter using shard by column reduces the data files to read.&lt;/p&gt;
-
-&lt;p&gt;Then look into parquet files, data within files are sorted by rowkey 
columns, that is to say, prefix match in query is as important as Kylin on 
HBase. When a query condition satisfies prefix match, it can filter row groups 
with column’s max/min index. Furthermore, we can reduce row group size to 
make finer index granularity, but be aware that the compression rate will be 
lower if we set row group size smaller.&lt;/p&gt;
-
-&lt;h4 
id=&quot;dynamic-elimination-of-partitioning-dimensions&quot;&gt;4.Dynamic 
elimination of partitioning dimensions&lt;/h4&gt;
-&lt;p&gt;Kylin4 have a new ability that the older version is not capable of, 
which is able to reduce dozens of times of data reading and computing for some 
big queries. It’s offen the case that partition column is used to filter data 
but not used as group dimension. For those cases Kylin would always choose 
cuboid with partition column, but now it is able to use different cuboid in 
that query to reduce IO read and computing.&lt;/p&gt;
-
-&lt;p&gt;The key of this optimization is to split a query into two parts, one 
of the part uses all segment’s data so that partition column doesn’t have 
to be included in cuboid, the other part that uses part of segments data will 
choose cuboid with partition dimension to do the data filter.&lt;/p&gt;
-
-&lt;p&gt;We have tested that in some situations the response time reduced from 
20s to 6s, 10s to 3s.&lt;/p&gt;
-
-&lt;p&gt;&lt;img src=&quot;/images/blog/youzan/8 
Dynamic_elimination_of_partitioning_dimensions.png&quot; alt=&quot;&quot; 
/&gt;&lt;/p&gt;
-
-&lt;h3 id=&quot;optimization-of-build-engine&quot;&gt;Optimization of build 
engine&lt;/h3&gt;
-&lt;p&gt;#### 1.cache parent dataset&lt;br /&gt;
-&lt;img src=&quot;/images/blog/youzan/9 cache_parent_dataset.png&quot; 
alt=&quot;&quot; /&gt;&lt;/p&gt;
-
-&lt;p&gt;Kylin build cube layer by layer. For a parent layer with multi 
cuboids to build, we can choose to cache parent dataset by setting 
kylin.engine.spark.parent-dataset.max.persist.count to a number greater than 0. 
But notice that if you set this value too small, it will affect the parallelism 
of build job, as the build granularity is at cuboid level.&lt;/p&gt;
-
-&lt;h2 id=&quot;practice-of-kylin-4-in-youzan&quot;&gt;04 Practice of Kylin 4 
in Youzan&lt;/h2&gt;
-&lt;p&gt;After introducing Youzan’s experience of performance optimization, 
let’s share the optimization effect. That is, Kylin 4’s practice in Youzan 
includes the upgrade process and the performance of online system.&lt;/p&gt;
-
-&lt;h3 id=&quot;upgrade-metadata-to-adapt-to-kylin-4&quot;&gt;Upgrade metadata 
to adapt to Kylin 4&lt;/h3&gt;
-&lt;p&gt;First of all, for metadata for Kylin 3 which stored on HBase, we have 
developed a tool for seamless upgrading of metadata. First of all, our metadata 
in Kylin on HBase is stored in HBase. We export the metadata in HBase into 
local files, and then use tools to transform and write back the new metadata 
into MySQL. We also updated the operation documents and general principles in 
the official wiki of Apache Kylin. For more details, you can refer to: &lt;a 
href=&quot;https://wiki.apache.org/confluence/display/KYLIN/How+to+migrate+metadata+to+Kylin+4&quot;&gt;How
 to migrate metadata to Kylin 4&lt;/a&gt;.&lt;/p&gt;
-
-&lt;p&gt;Let’s give a general introduction to some compatibility in the 
whole process. The project metadata, tables metadata, permission-related 
metadata, and model metadata do not need be modified. What needs to be modified 
is the cube metadata, including the type of storage and query used by Cube. 
After updating these two fields, you need to recalculate the Cube signature. 
The function of this signature is designed internally by Kylin to avoid some 
problems caused by Cube after Cube is determined.&lt;/p&gt;
-
-&lt;h3 
id=&quot;performance-of-kylin-4-on-youzan-online-system&quot;&gt;Performance of 
Kylin 4 on Youzan online system&lt;/h3&gt;
-&lt;p&gt;&lt;img src=&quot;/images/blog/youzan/10 commodity_insight.png&quot; 
alt=&quot;&quot; /&gt;&lt;/p&gt;
-
-&lt;p&gt;After the migration of metadata to Kylin4, let’s share the 
qualitative changes and substantial performance improvements brought about by 
some of the promising scenarios. First of all, in a scenario like Commodity 
Insight, there is a large store with several hundred thousand of commodities. 
We have to analyze its transactions and traffic, etc. There are more than a 
dozen precise precisely count distinct measures in single cube. Precisely count 
distinct measure is actually very inefficient if it is not optimized through 
pre-calculation and Bitmap. Kylin currently uses Bitmap to support precisely 
count distinct measure. In a scene that requires complex queries to sort 
hundreds of thousands of commodities in various UV(precisely count distinct 
measure), the RT of Kylin 2 is 27 seconds, while the RT of Kylin 4 is reduced 
from 27 seconds to less than 2 seconds.&lt;/p&gt;
-
-&lt;p&gt;What I find most appealing to me about Kylin 4 is that it’s like a 
manual transmission car, you can control its query concurrency at your will, 
whereas you can’t change query concurrency in Kylin on HBase freely, because 
its concurrency is completely tied to the number of regions.&lt;/p&gt;
-
-&lt;h3 id=&quot;plan-for-kylin-4-in-youzan&quot;&gt;Plan for Kylin 4 in 
Youzan&lt;/h3&gt;
-&lt;p&gt;We have made full test, fixed several bugs and improved apache KYLIN4 
for several months. Now we are migrating cubes from older version to newer 
version. For the cubes already migrated to KYLIN4, its small queries’ 
performance meet our expectations, its complex query and build performance did 
bring us a big surprise. We are planning to migrate all cubes from older 
version to Kylin4.&lt;/p&gt;
-</description>
-        <pubDate>Thu, 17 Jun 2021 08:00:00 -0700</pubDate>
-        
<link>http://kylin.apache.org/blog/2021/06/17/Why-did-Youzan-choose-Kylin4/</link>
-        <guid 
isPermaLink="true">http://kylin.apache.org/blog/2021/06/17/Why-did-Youzan-choose-Kylin4/</guid>
-        
-        
-        <category>blog</category>
-        
-      </item>
-    
-      <item>
         <title>有赞为什么选择 Kylin4</title>
         <description>&lt;p&gt;在 2021年5月29日举办的 QCon 全球软件开发者大会上,来自有赞的数据基础平台负责人 郑生俊 在大数据开源框架与应用专题上分享了有赞内部对 Kylin 4.0 的使用经历和优化实践,对于众多 Kylin 老用户来说,这也是升级 Kylin 4 的实用攻略。&lt;/p&gt;
 
@@ -885,6 +887,155 @@ Here is a brief introduction to the prin
       </item>
     
       <item>
+        <title>Why did Youzan choose Kylin4</title>
+        <description>&lt;p&gt;At the QCon Global Software Developers Conference held on May 29, 2021, Zheng Shengjun, head of Youzan’s data infrastructure platform, shared Youzan’s internal use experience and optimization practice of Kylin 4.0 in the session on open source big data frameworks and applications. &lt;br /&gt;
+For many users of Kylin 2/3 (Kylin on HBase), this is also a chance to learn how and why to upgrade to Kylin 4.&lt;/p&gt;
+
+&lt;p&gt;This sharing is mainly divided into the following parts:&lt;/p&gt;
+
+&lt;ul&gt;
+  &lt;li&gt;The reason for choosing Kylin 4&lt;/li&gt;
+  &lt;li&gt;Introduction to Kylin 4&lt;/li&gt;
+  &lt;li&gt;How to optimize performance of Kylin 4&lt;/li&gt;
+  &lt;li&gt;Practice of Kylin 4 in Youzan&lt;/li&gt;
+&lt;/ul&gt;
+
+&lt;h2 id=&quot;the-reason-for-choosing-kylin-4&quot;&gt;01 The reason for 
choosing Kylin 4&lt;/h2&gt;
+
+&lt;h3 id=&quot;introduction-to-youzan&quot;&gt;Introduction to 
Youzan&lt;/h3&gt;
+&lt;p&gt;China Youzan Co., Ltd (stock code 08083.HK) is an enterprise mainly engaged in retail technology services.&lt;br /&gt;
+At present, it owns several tools and solutions that provide SaaS software products and talent services to help merchants operate mobile social e-commerce and new retail channels in an all-round way. &lt;br /&gt;
+Currently Youzan has hundreds of millions of consumers and 6 million existing merchants.&lt;/p&gt;
+
+&lt;h3 id=&quot;history-of-kylin-in-youzan&quot;&gt;History of Kylin in 
Youzan&lt;/h3&gt;
+&lt;p&gt;&lt;img src=&quot;/images/blog/youzan/1 
history_of_youzan_OLAP.png&quot; alt=&quot;&quot; /&gt;&lt;/p&gt;
+
+&lt;p&gt;First of all, I would like to share why Youzan chose to upgrade to Kylin 4. Let me briefly review the history of Youzan’s OLAP infrastructure.&lt;/p&gt;
+
+&lt;p&gt;In the early days of Youzan, in order to iterate on the development process quickly, we chose pre-computation + MySQL; in 2018, Druid was introduced for its query flexibility and development efficiency, but it had problems such as a low pre-aggregation ratio and no support for precisely count distinct measures. In this situation, Youzan introduced Apache Kylin and ClickHouse. Kylin supports high aggregation, precisely count distinct measures and the lowest RT, while ClickHouse is quite flexible in usage (ad hoc query).&lt;/p&gt;
+
+&lt;p&gt;From the introduction of Kylin in 2018 to now, Youzan has used Kylin for more than three years. With the continuous enrichment of business scenarios and the continuous accumulation of data volume, Youzan currently has 6 million existing merchants, its GMV in 2020 was 107.3 billion, and its daily build data volume is 10 billion+ rows. At present, Kylin has basically covered all of Youzan’s business scenarios.&lt;/p&gt;
+
+&lt;h3 id=&quot;the-challenges-of-kylin-3&quot;&gt;The challenges of Kylin 
3&lt;/h3&gt;
+&lt;p&gt;With Youzan’s rapid development and in-depth use of Kylin, we also 
encountered some challenges:&lt;/p&gt;
+
+&lt;ul&gt;
+  &lt;li&gt;First, the build performance of Kylin on HBase could not meet expectations, and build performance affects the user’s failure recovery time and stability experience;&lt;/li&gt;
+  &lt;li&gt;Second, the access of more large merchants (tens of millions of members in a single store, hundreds of thousands of goods in each store) brought great challenges to our OLAP system. Kylin on HBase is limited by the single-point query of the Query Server, and cannot support these complex scenarios well;&lt;/li&gt;
+  &lt;li&gt;Finally, because HBase is not a cloud-native system, it is difficult to scale up and down flexibly. Since data volume keeps growing while business load has peaks and valleys, the average resource utilization rate is not high enough.&lt;/li&gt;
+&lt;/ul&gt;
+
+&lt;p&gt;Faced with these challenges, Youzan chose to move closer and upgrade 
to the more cloud-native Apache Kylin 4.&lt;/p&gt;
+
+&lt;h2 id=&quot;introduction-to-kylin-4&quot;&gt;02 Introduction to Kylin 
4&lt;/h2&gt;
+&lt;p&gt;First of all, let’s introduce the main advantages of Kylin 4. Apache Kylin 4 depends entirely on Spark for cubing jobs and queries. It can make full use of Spark’s parallelization, vectorization, and whole-stage code generation technologies to improve the efficiency of large queries.&lt;br /&gt;
+Here is a brief introduction to the principles of Kylin 4, that is, its storage engine, build engine and query engine.&lt;/p&gt;
+
+&lt;h3 id=&quot;storage-engine&quot;&gt;Storage engine&lt;/h3&gt;
+&lt;p&gt;&lt;img src=&quot;/images/blog/youzan/2 kylin4_storage.png&quot; 
alt=&quot;&quot; /&gt;&lt;/p&gt;
+
+&lt;p&gt;First, let’s take a look at the new storage engine, comparing Kylin on HBase and Kylin on Parquet. In Kylin on HBase, cuboid data is stored in HBase tables; a single segment corresponds to one HBase table, and aggregation is pushed down to the HBase coprocessor.&lt;/p&gt;
+
+&lt;p&gt;But as we know, HBase is not a real columnar storage and its throughput is not enough for an OLAP system. Kylin 4 replaces HBase with Parquet; all the data is stored in files, and each segment has a corresponding HDFS directory. All queries and cubing jobs read and write files without HBase. Although there is a certain performance loss for simple queries, the improvement for complex queries is considerable and worthwhile.&lt;/p&gt;
+
+&lt;h3 id=&quot;build-engine&quot;&gt;Build engine&lt;/h3&gt;
+&lt;p&gt;&lt;img src=&quot;/images/blog/youzan/3 kylin4_build_engine.png&quot; 
alt=&quot;&quot; /&gt;&lt;/p&gt;
+
+&lt;p&gt;The second is the new build engine. Based on our test, the build 
speed of Kylin on Parquet has been optimized from 82 minutes to 15 minutes. 
There are several reasons:&lt;/p&gt;
+
+&lt;ul&gt;
+  &lt;li&gt;Kylin 4 removes dimension encoding, eliminating one encoding step of the build;&lt;/li&gt;
+  &lt;li&gt;It removes the HFile generation step;&lt;/li&gt;
+  &lt;li&gt;Kylin on Parquet changes the granularity of cubing to the cuboid level, which helps further improve the parallelism of cubing jobs.&lt;/li&gt;
+  &lt;li&gt;It has an enhanced implementation of the global dictionary. In the new algorithm, dictionary and source data are hashed into the same buckets, making it possible to load only one dictionary bucket to encode a piece of source data.&lt;/li&gt;
+&lt;/ul&gt;
+
+&lt;p&gt;As you can see on the right, after upgrading to Kylin 4, the cubing job changes from ten steps to two steps; the build performance improvement is very obvious.&lt;/p&gt;
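The bucketed global dictionary in the last bullet can be sketched as follows. This is a simplified illustration, not Kylin's actual implementation; the bucket count and hashing scheme are hypothetical:

```python
# Simplified sketch of a bucketed global dictionary: values are hashed
# into buckets, each bucket assigns local ids, and the global id is
# bucket offset + local id. An encoder only needs to load the buckets
# that its input values hash into, instead of the whole dictionary.

BUCKETS = 4  # hypothetical bucket count

def build_dictionary(values):
    buckets = [dict() for _ in range(BUCKETS)]
    for v in sorted(set(values)):
        b = buckets[hash(v) % BUCKETS]
        b[v] = len(b)  # local id within the bucket
    # cumulative offsets turn local ids into globally unique ids
    offsets, total = [], 0
    for b in buckets:
        offsets.append(total)
        total += len(b)
    return buckets, offsets

def encode(value, buckets, offsets):
    i = hash(value) % BUCKETS  # only bucket i must be loaded
    return offsets[i] + buckets[i][value]
```

Because the offsets are cumulative, every distinct value still receives a unique global id, while encoding a partition of the source data touches only a subset of the buckets.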
+
+&lt;h3 id=&quot;query-engine&quot;&gt;Query engine&lt;/h3&gt;
+&lt;p&gt;&lt;img src=&quot;/images/blog/youzan/4 kylin4_query.png&quot; 
alt=&quot;&quot; /&gt;&lt;/p&gt;
+
+&lt;p&gt;Next is the new query engine of Kylin 4. As you can see, the calculation of Kylin on HBase depends completely on the HBase coprocessor and the query server process. When data is read from HBase into the query server to do aggregation, sorting, etc., the bottleneck is the single point of the query server. Kylin 4 instead uses a fully distributed query mechanism based on Spark; what’s more, it is able to tune the Spark configuration automatically in the query step!&lt;/p&gt;
+
+&lt;h2 id=&quot;how-to-optimize-performance-of-kylin-4&quot;&gt;03 How to 
optimize performance of Kylin 4&lt;/h2&gt;
+&lt;p&gt;Next, I’d like to share some performance optimizations made by 
Youzan in Kylin 4.&lt;/p&gt;
+
+&lt;h3 id=&quot;optimization-of-query-engine&quot;&gt;Optimization of query 
engine&lt;/h3&gt;
+&lt;h4 id=&quot;cache-calcite-physical-plan&quot;&gt;1. Cache Calcite physical plan&lt;/h4&gt;
+&lt;p&gt;&lt;img src=&quot;/images/blog/youzan/5 cache_calcite_plan.png&quot; alt=&quot;&quot; /&gt;&lt;/p&gt;
+
+&lt;p&gt;In Kylin 4, SQL is analyzed, optimized and code-generated in Calcite. This step takes about 150 ms for some queries. We added a PreparedStatementCache in Kylin 4 to cache the Calcite plan, so that the same structured SQL doesn’t have to repeat those steps. With this optimization, about 150 ms of time cost is saved.&lt;/p&gt;
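The idea behind caching the physical plan can be illustrated with a minimal sketch; the planning function and cache below are hypothetical stand-ins, not Kylin's Calcite integration:

```python
# Minimal sketch of a plan cache: the expensive analyze/optimize/
# codegen step runs once per SQL template, and later executions of
# the same template reuse the cached plan.
import functools
import time

def plan(sql: str) -> str:
    """Stand-in for the analyze/optimize/codegen step (~150 ms in the post)."""
    time.sleep(0.01)  # simulated planning cost
    return f"PLAN[{sql}]"

@functools.lru_cache(maxsize=128)
def cached_plan(sql: str) -> str:
    return plan(sql)

def execute(sql: str, params=()):
    p = cached_plan(sql)  # cache hit: planning is skipped entirely
    return (p, params)    # stand-in for actually running the plan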
+
+&lt;h4 id=&quot;tunning-spark-configuration&quot;&gt;2. Tuning Spark configuration&lt;/h4&gt;
+&lt;p&gt;&lt;img src=&quot;/images/blog/youzan/6 
tuning_spark_configuration.png&quot; alt=&quot;&quot; /&gt;&lt;/p&gt;
+
+&lt;p&gt;Kylin 4 uses Spark as its query engine. As Spark is a distributed engine designed for massive data processing, it inevitably loses some performance on small queries. We did some tuning to catch up with the latency of Kylin on HBase for small queries.&lt;/p&gt;
+
+&lt;p&gt;Our first optimization is to make more calculations finish in memory. 
The key is to avoid data spill during aggregation, shuffle and sort. Tuning the 
following configuration is helpful.&lt;/p&gt;
+
+&lt;ul&gt;
+  &lt;li&gt;Set &lt;code class=&quot;highlighter-rouge&quot;&gt;spark.sql.objectHashAggregate.sortBased.fallbackThreshold&lt;/code&gt; to a larger value to avoid HashAggregate falling back to sort-based aggregation, which really kills performance when it happens.&lt;/li&gt;
+  &lt;li&gt;Set &lt;code class=&quot;highlighter-rouge&quot;&gt;spark.shuffle.spill.initialMemoryThreshold&lt;/code&gt; to a large value to avoid too many spills during shuffle.&lt;/li&gt;
+&lt;/ul&gt;
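As a concrete example, the two settings above could be raised in `spark-defaults.conf`; the values below are illustrative, not recommendations from the post:

```properties
# Keep aggregation in memory: raise the fallback threshold so
# ObjectHashAggregate does not fall back to sort-based aggregation.
spark.sql.objectHashAggregate.sortBased.fallbackThreshold=1000000

# Start shuffle spill bookkeeping at a higher threshold to avoid
# many small spills to disk.
spark.shuffle.spill.initialMemoryThreshold=268435456
```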
+
+&lt;p&gt;Secondly, we route small queries to a Query Server that runs Spark in local mode, because the overhead of task scheduling, shuffle reads and variable broadcast is amplified for small queries in YARN/Standalone mode.&lt;/p&gt;
+
+&lt;p&gt;Thirdly, we use a RAM disk to improve shuffle performance: mount a RAM disk as tmpfs and point spark.local.dir to a directory on it.&lt;/p&gt;
+
+&lt;p&gt;Lastly, we disabled Spark’s whole-stage code generation for small queries, since it costs about 100–200 ms per query and brings no benefit to a small query that is a simple projection.&lt;/p&gt;
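The last three tunings map to standard Spark settings; a hedged sketch, with an illustrative RAM-disk path and size:

```properties
# Run Spark in local mode on the query server that handles small
# queries, avoiding YARN task-scheduling and broadcast overhead.
spark.master=local[*]

# Shuffle to a RAM disk: first `mount -t tmpfs -o size=16g tmpfs /mnt/ramdisk`
# (path and size are illustrative), then point spark.local.dir at it.
spark.local.dir=/mnt/ramdisk

# Whole-stage codegen costs ~100-200 ms per query; disable it for
# small queries that are simple projections.
spark.sql.codegen.wholeStage=false
```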
+
+&lt;h4 id=&quot;parquet-optimization&quot;&gt;3. Parquet optimization&lt;/h4&gt;
+&lt;p&gt;&lt;img src=&quot;/images/blog/youzan/7 
parquet_optimization.png&quot; alt=&quot;&quot; /&gt;&lt;/p&gt;
+
+&lt;p&gt;Optimizing parquet is also important for queries.&lt;/p&gt;
+
+&lt;p&gt;The first principle is that we’d better always include the shard-by column in our filter condition: since parquet files are sharded by the shard-by column, filtering on it reduces the number of data files to read.&lt;/p&gt;
+
+&lt;p&gt;Then look inside the parquet files: data within a file is sorted by the rowkey columns, which means prefix matching in a query is as important as in Kylin on HBase. When a query condition satisfies a prefix match, row groups can be filtered with the columns’ max/min indexes. Furthermore, we can reduce the row group size to get a finer index granularity, but be aware that the compression rate will be lower if the row group size is smaller.&lt;/p&gt;
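Row-group pruning with min/max indexes can be sketched like this; it is a simplified model of the idea, not the actual Parquet reader:

```python
# Each row group keeps min/max statistics per column; a range
# predicate can skip any group whose [min, max] interval cannot
# overlap the predicate's interval.

def prune_row_groups(row_groups, column, lo, hi):
    """row_groups: list of dicts like {"colA": (min, max), ...}.
    Returns indices of groups that may contain rows with lo <= col <= hi."""
    keep = []
    for i, stats in enumerate(row_groups):
        gmin, gmax = stats[column]
        if gmax >= lo and gmin <= hi:  # intervals overlap -> must read
            keep.append(i)
    return keep
```

Because the data is sorted by rowkey, a prefix-matching filter tends to hit few, contiguous row groups, which is what makes the min/max index effective.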
+
+&lt;h4 id=&quot;dynamic-elimination-of-partitioning-dimensions&quot;&gt;4. Dynamic elimination of partitioning dimensions&lt;/h4&gt;
+&lt;p&gt;Kylin 4 has a new ability that older versions lack, which can reduce data reading and computing dozens of times for some big queries. It is often the case that the partition column is used to filter data but not as a group-by dimension. In those cases Kylin used to always choose a cuboid containing the partition column, but now it is able to use a different cuboid in such a query to reduce IO and computing.&lt;/p&gt;
+
+&lt;p&gt;The key of this optimization is to split a query into two parts: the part that uses all of a segment’s data, so the partition column doesn’t have to be included in the cuboid, and the part that uses only some of a segment’s data, which chooses a cuboid with the partition dimension to filter the data.&lt;/p&gt;
+
+&lt;p&gt;In our tests, in some situations the response time was reduced from 20 s to 6 s, or from 10 s to 3 s.&lt;/p&gt;
+
+&lt;p&gt;&lt;img src=&quot;/images/blog/youzan/8 
Dynamic_elimination_of_partitioning_dimensions.png&quot; alt=&quot;&quot; 
/&gt;&lt;/p&gt;
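The segment split described above can be sketched as follows; segment and filter ranges are modeled as simple half-open intervals, which is an assumption for illustration:

```python
# Sketch of the split: a segment whose whole time range falls inside
# the filter needs no partition-column filtering, so it can use a
# smaller cuboid without the partition dimension; only a segment that
# partially overlaps the filter keeps the partition column.

def plan_segments(segments, f_start, f_end):
    """segments: list of (seg_start, seg_end) half-open time ranges.
    Returns (fully_covered, partially_covered) segment lists."""
    full, partial = [], []
    for seg in segments:
        s, e = seg
        if s >= f_start and e <= f_end:
            full.append(seg)      # read cuboid WITHOUT partition column
        elif e > f_start and s < f_end:
            partial.append(seg)   # read cuboid WITH partition column
    return full, partial
```

For a long time range, most segments land in the fully covered list, so most of the scan runs against the smaller cuboid.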
+
+&lt;h3 id=&quot;optimization-of-build-engine&quot;&gt;Optimization of build 
engine&lt;/h3&gt;
+&lt;h4 id=&quot;cache-parent-dataset&quot;&gt;1. Cache parent dataset&lt;/h4&gt;
+&lt;p&gt;&lt;img src=&quot;/images/blog/youzan/9 cache_parent_dataset.png&quot; alt=&quot;&quot; /&gt;&lt;/p&gt;
+
+&lt;p&gt;Kylin builds the cube layer by layer. For a parent layer with multiple cuboids to build, we can choose to cache the parent dataset by setting kylin.engine.spark.parent-dataset.max.persist.count to a number greater than 0. But notice that if you set this value too small, it will affect the parallelism of the build job, as the build granularity is at the cuboid level.&lt;/p&gt;
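In kylin.properties this looks like the fragment below; the value 1 is purely illustrative and should be weighed against your build-job parallelism:

```properties
# Persist up to this many parent datasets during layered cubing;
# a value that is too small limits parallelism, since builds are
# scheduled per cuboid.
kylin.engine.spark.parent-dataset.max.persist.count=1
```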
+
+&lt;h2 id=&quot;practice-of-kylin-4-in-youzan&quot;&gt;04 Practice of Kylin 4 
in Youzan&lt;/h2&gt;
+&lt;p&gt;After introducing Youzan’s performance optimizations, let’s share their effect: Kylin 4’s practice in Youzan, including the upgrade process and the performance of the online system.&lt;/p&gt;
+
+&lt;h3 id=&quot;upgrade-metadata-to-adapt-to-kylin-4&quot;&gt;Upgrade metadata 
to adapt to Kylin 4&lt;/h3&gt;
+&lt;p&gt;First of all, for Kylin 3 metadata, which is stored in HBase, we developed a tool for seamless upgrading. We export the metadata from HBase into local files, then use the tool to transform it and write the new metadata into MySQL. We also updated the operation documents and general principles in the official wiki of Apache Kylin. For more details, you can refer to: &lt;a href=&quot;https://wiki.apache.org/confluence/display/KYLIN/How+to+migrate+metadata+to+Kylin+4&quot;&gt;How to migrate metadata to Kylin 4&lt;/a&gt;.&lt;/p&gt;
+
+&lt;p&gt;Here is a general introduction to compatibility in the whole process. The project metadata, table metadata, permission-related metadata, and model metadata do not need to be modified. What needs to be modified is the cube metadata, including the storage type and query type used by the cube. After updating these two fields, you need to recalculate the cube signature. The signature is a mechanism designed internally by Kylin to guard against problems after a cube definition is fixed.&lt;/p&gt;
+
+&lt;h3 
id=&quot;performance-of-kylin-4-on-youzan-online-system&quot;&gt;Performance of 
Kylin 4 on Youzan online system&lt;/h3&gt;
+&lt;p&gt;&lt;img src=&quot;/images/blog/youzan/10 commodity_insight.png&quot; 
alt=&quot;&quot; /&gt;&lt;/p&gt;
+
+&lt;p&gt;After migrating the metadata to Kylin 4, let’s share the qualitative changes and substantial performance improvements in some promising scenarios. First, in a scenario like Commodity Insight, there is a large store with several hundred thousand commodities. We have to analyze its transactions, traffic, etc. There are more than a dozen precisely count distinct measures in a single cube. A precisely count distinct measure is actually very inefficient if it is not optimized through pre-calculation and Bitmap; Kylin currently uses Bitmap to support it. In a scenario that requires complex queries to sort hundreds of thousands of commodities by various UVs (precisely count distinct measures), the RT of Kylin 2 was 27 seconds, while the RT of Kylin 4 is reduced from 27 seconds to less than 2 seconds.&lt;/p&gt;
+
+&lt;p&gt;What I find most appealing about Kylin 4 is that it’s like a manual transmission car: you can control its query concurrency at will, whereas you can’t change query concurrency in Kylin on HBase freely, because its concurrency is completely tied to the number of regions.&lt;/p&gt;
+
+&lt;h3 id=&quot;plan-for-kylin-4-in-youzan&quot;&gt;Plan for Kylin 4 in 
Youzan&lt;/h3&gt;
+&lt;p&gt;We have tested fully, fixed several bugs and improved Apache Kylin 4 over several months. Now we are migrating cubes from the older version to the newer one. For the cubes already migrated to Kylin 4, small-query performance meets our expectations, and complex-query and build performance brought us a big surprise. We are planning to migrate all cubes from the older version to Kylin 4.&lt;/p&gt;
+</description>
+        <pubDate>Thu, 17 Jun 2021 08:00:00 -0700</pubDate>
+        
<link>http://kylin.apache.org/blog/2021/06/17/Why-did-Youzan-choose-Kylin4/</link>
+        <guid 
isPermaLink="true">http://kylin.apache.org/blog/2021/06/17/Why-did-Youzan-choose-Kylin4/</guid>
+        
+        
+        <category>blog</category>
+        
+      </item>
+    
+      <item>
         <title>你离可视化酷炫大屏只差一套 Kylin + Davinci</title>
         <description>&lt;p&gt;Kylin 提供与 BI 工具的整合能力,如 
Tableau,PowerBI/Excel,MSTR,QlikSense,Hue 和 
SuperSet。但就可视化工具而言,Davinci 
良好的交互性和个性化的可视化大屏展现效果,使其与 Kylin 
的结合能让大部分用户有更好的可视化分析体验。&lt;/p&gt;
 
@@ -1030,730 +1181,6 @@ You should be able to see the tables/cub
         
         
         <category>blog</category>
-        
-      </item>
-    
-      <item>
-        <title>Detailed Analysis of refine query cache</title>
-        <description>&lt;hr /&gt;
-
-&lt;h2 id=&quot;part-i-basic-introduction&quot;&gt;Part-I Basic 
Introduction&lt;/h2&gt;
-
-&lt;h3 id=&quot;backgroud&quot;&gt;Backgroud&lt;/h3&gt;
-&lt;p&gt;In the past, query cache are not efficiently used in Kylin due to two 
aspects: &lt;strong&gt;coarse-grained cache expiration strategy&lt;/strong&gt; 
and &lt;strong&gt;lack of external cache&lt;/strong&gt;. Because of the 
aggressive cache expiration strategy, useful caches are often cleaned up 
unnecessarily. Because query caches are stored in local servers, they cannot be 
shared between servers. And because of the size limitation of local cache, not 
all useful query results can be cached.&lt;/p&gt;
-
-&lt;p&gt;To deal with these shortcomings, we change the query cache expiration 
strategy by signature checking and introduce the memcached as Kylin’s 
distributed cache so that Kylin servers are able to share cache between 
servers. And it’s easy to add memcached servers to scale out distributed 
cache.&lt;/p&gt;
-
-&lt;p&gt;These features is proposed and developed by eBay Kylin team. Thanks 
so much for their contribution.&lt;/p&gt;
-
-&lt;h3 id=&quot;related-jira&quot;&gt;Related JIRA&lt;/h3&gt;
-
-&lt;ul&gt;
-  &lt;li&gt;&lt;a 
href=&quot;https://issues.apache.org/jira/browse/KYLIN-2895&quot;&gt;KYLIN-2895 
Refine Query Cache&lt;/a&gt;
-    &lt;ul&gt;
-      &lt;li&gt;&lt;a 
href=&quot;https://issues.apache.org/jira/browse/KYLIN-2899&quot;&gt;KYLIN-2899 
Introduce segment level query cache&lt;/a&gt;&lt;/li&gt;
-      &lt;li&gt;&lt;a 
href=&quot;https://issues.apache.org/jira/browse/KYLIN-2898&quot;&gt;KYLIN-2898 
Introduce memcached as a distributed cache for queries&lt;/a&gt;&lt;/li&gt;
-      &lt;li&gt;&lt;a 
href=&quot;https://issues.apache.org/jira/browse/KYLIN-2894&quot;&gt;KYLIN-2894 
Change the query cache expiration strategy by signature 
checking&lt;/a&gt;&lt;/li&gt;
-      &lt;li&gt;&lt;a 
href=&quot;https://issues.apache.org/jira/browse/KYLIN-2897&quot;&gt;KYLIN-2897 
Improve the query execution for a set of duplicate queries in a short 
period&lt;/a&gt;&lt;/li&gt;
-      &lt;li&gt;&lt;a 
href=&quot;https://issues.apache.org/jira/browse/KYLIN-2896&quot;&gt;KYLIN-2896 
Refine query exception cache&lt;/a&gt;&lt;/li&gt;
-    &lt;/ul&gt;
-  &lt;/li&gt;
-&lt;/ul&gt;
-
-&lt;hr /&gt;
-
-&lt;h2 id=&quot;part-ii-deep-dive&quot;&gt;Part-II Deep Dive&lt;/h2&gt;
-
-&lt;ul&gt;
-  &lt;li&gt;Introduce memcached as a Distributed Query Cache&lt;/li&gt;
-  &lt;li&gt;Segment Level Cache&lt;/li&gt;
-  &lt;li&gt;Query Cache Expiration Strategy by Signature Checking&lt;/li&gt;
-  &lt;li&gt;Other Enhancement&lt;/li&gt;
-&lt;/ul&gt;
-
-&lt;h3 
id=&quot;introduce-memcached-as-a-distributed-query-cache&quot;&gt;Introduce 
memcached as a Distributed Query Cache&lt;/h3&gt;
-
-&lt;p&gt;&lt;strong&gt;Memcached&lt;/strong&gt; is a Free and open source, 
high-performance, distributed memory object caching system. It is an in-memory 
key-value store for small chunks of arbitrary data (strings, objects) from 
results of database calls, API calls, or page rendering. It is simple yet 
powerful. Its simple design promotes quick deployment, ease of development, and 
solves many problems facing large data caches. Its API is available for most 
popular languages.&lt;/p&gt;
-
-&lt;p&gt;By KYLIN-2898, Kylin use &lt;strong&gt;Memcached&lt;/strong&gt; as 
distributed cache service, and use &lt;strong&gt;EhCache&lt;/strong&gt; as 
local cache service. When &lt;code 
class=&quot;highlighter-rouge&quot;&gt;RemoteLocalFailOverCacheManager&lt;/code&gt;
 is configured in &lt;code 
class=&quot;highlighter-rouge&quot;&gt;applicationContext.xml&lt;/code&gt;, for 
each cache put/get action, Kylin will first check if remote cache service is 
available, only if remote cache service is unavailable, local cache service 
will be used.&lt;/p&gt;
-
-&lt;p&gt;Firstly, multi query server can share query cache. For each kylin 
server, less jvm memory will be occupied which help to reduce GC pressure. 
Secondly, since memcached is centralized so duplicated cache entry will avoid 
in serval Kylin process. Thirdly, memcached has larger size and easy to scale 
out, this will help to reduce the chance which useful cache entry have to be 
dropped due to limited memory capacity.&lt;/p&gt;
-
-&lt;p&gt;To handle node failure and to scale out memcached cluster, author has 
introduced a consistent hash strategy to smoothly solve such problem. Ketama is 
an implementation of a consistent hashing algorithm, meaning you can add or 
remove servers from the memcached pool without causing a complete remap of all 
keys. Detail could be checked at &lt;a 
href=&quot;https://www.last.fm/user/RJ/journal/2007/04/10/rz_libketama_-_a_consistent_hashing_algo_for_memcache_clients&quot;&gt;Ketama
 consistent hash strategy&lt;/a&gt;.&lt;/p&gt;
-
-&lt;p&gt;&lt;img 
src=&quot;/images/blog/refine-query-cache/consistent-hashing.png&quot; 
alt=&quot;consistent hashing&quot; /&gt;&lt;/p&gt;
-
-&lt;h3 id=&quot;segment-level-cache&quot;&gt;Segment level Cache&lt;/h3&gt;
-
-&lt;p&gt;Currently Kylin use sql as the cache key, when sql comes, if result 
exists in the cache, it will directly returned the cached result and don’t 
need to query hbase. When there is new segment build or existing segment 
refresh, all related cache result need to be evicted. For some frequently build 
cube such as streaming cube(NRT Streaming or Real-time OLAP), the cache miss 
will increase dramatically, that may decrease the query performance.&lt;/p&gt;
-
-&lt;p&gt;Since for Kylin cube, most historical segments are immutable, the 
same query against historical segments should be always same, don’t need to 
be evicted for new segment building. So we decide to implement the segment 
level cache, it is a complement of the existing front-end cache, the idea is 
similar as the level1/level2 cache in operating system.&lt;/p&gt;
-
-&lt;p&gt;&lt;img 
src=&quot;/images/blog/refine-query-cache/l1-l2-cache.png&quot; 
alt=&quot;l1-l2-cache&quot; /&gt;&lt;/p&gt;
-
-&lt;h3 
id=&quot;query-cache-expiration-strategy-by-signature-checking&quot;&gt;Query 
Cache Expiration Strategy by Signature Checking&lt;/h3&gt;
-
-&lt;p&gt;Currently, to invalid query cache, &lt;code 
class=&quot;highlighter-rouge&quot;&gt;CacheService&lt;/code&gt; will either 
invoke &lt;code 
class=&quot;highlighter-rouge&quot;&gt;cleanDataCache&lt;/code&gt; or &lt;code 
class=&quot;highlighter-rouge&quot;&gt;cleanAllDataCache&lt;/code&gt;. Both 
methods will clear all of the query cache , which is very inefficient and 
unnecessary. In production environment, there’s around hundreds of cubing 
jobs per day, which means the query cache will be cleared very several minutes. 
Then we introduced a signature to upgrade cache invalidation strategy.&lt;/p&gt;
-
-&lt;p&gt;The basic idea is as follows:&lt;br /&gt;
-When put SQLResponse into cache, we add signature for each SQLResponse. To 
calculate signature for SQLResponse, we choose the cube last build time and its 
segments to as input of &lt;code 
class=&quot;highlighter-rouge&quot;&gt;SignatureCalculator&lt;/code&gt;.&lt;br 
/&gt;
-When fetch &lt;code 
class=&quot;highlighter-rouge&quot;&gt;SQLResponse&lt;/code&gt; for cache, 
first check whether the signature is consistent. If not, this cached value is 
overdue and will be invalidate.&lt;/p&gt;
-
-&lt;p&gt;As for the calculation of signature is show as follows:&lt;br /&gt;
-1. &lt;code class=&quot;highlighter-rouge&quot;&gt;toString&lt;/code&gt; of 
&lt;code class=&quot;highlighter-rouge&quot;&gt;ComponentSignature&lt;/code&gt; 
will concatenate member varible into a large String; if a &lt;code 
class=&quot;highlighter-rouge&quot;&gt;ComponentSignature&lt;/code&gt; has 
other &lt;code 
class=&quot;highlighter-rouge&quot;&gt;ComponentSignature&lt;/code&gt; as 
member, toString will be calculated recursively&lt;br /&gt;
-2. return value of &lt;code 
class=&quot;highlighter-rouge&quot;&gt;toString&lt;/code&gt; will be input of 
&lt;code 
class=&quot;highlighter-rouge&quot;&gt;SignatureCalculator&lt;/code&gt;,&lt;br 
/&gt;
-&lt;code 
class=&quot;highlighter-rouge&quot;&gt;SignatureCalculator&lt;/code&gt; encode 
string using MD5 as identifer of signature of query cache&lt;/p&gt;
-
-&lt;p&gt;&lt;img 
src=&quot;/images/blog/refine-query-cache/cache-signature.png&quot; 
alt=&quot;cache-signature&quot; /&gt;&lt;/p&gt;
-
-&lt;h3 id=&quot;other-enhancement&quot;&gt;Other Enhancement&lt;/h3&gt;
-
-&lt;h4 
id=&quot;improve-the-query-execution-for-a-set-of-duplicate-queries-in-a-short-period&quot;&gt;Improve
 the query execution for a set of duplicate queries in a short period&lt;/h4&gt;
-
-&lt;p&gt;If same query enter Kylin at the same time by different client, for 
each query they can not find query cache so they must be calculated 
respectively. And even wrose, if these query are complex, they usually cost a 
long duration so Kylin have less chance to utilize cache query; and them cost 
large computation resources that will make query server has poor performance 
has harm to HBase cluster.&lt;/p&gt;
-
-&lt;p&gt;To reduce the impact of duplicated and complex query, it may be a 
good idea to block query which came later, wait to first one return result as 
far as possible. This lazy strategy is especially useful if you have duplicated 
complex query came in same time. To enbale it, you should set &lt;code 
class=&quot;highlighter-rouge&quot;&gt;kylin.query.lazy-query-enabled&lt;/code&gt;
 to &lt;code class=&quot;highlighter-rouge&quot;&gt;true&lt;/code&gt;. 
Optionlly, you may set &lt;code 
class=&quot;highlighter-rouge&quot;&gt;kylin.query.lazy-query-waiting-timeout-milliseconds&lt;/code&gt;
 to what you think later duplicated query wait duration to meet your 
situation.&lt;/p&gt;
-
-&lt;h4 id=&quot;remove-exception-cache&quot;&gt;Remove exception 
cache&lt;/h4&gt;
-&lt;p&gt;Formerly, query cache has been divided into two part, one part for 
storing success query result, another for failed query result, and they are 
invalidated respectively. It looks like not a good classification criteria 
because it is not fine-grained enough. After query cache signature was 
introduced, we have no reason to take them apart, so exception cache was 
removed.&lt;/p&gt;
-
-&lt;hr /&gt;
-
-&lt;h2 id=&quot;part-iii-how-to-use&quot;&gt;Part-III How to Use&lt;/h2&gt;
-
-&lt;p&gt;To get prepared, you need to install memcached, you may refer to 
https://github.com/memcached/memcached/wiki/Install. Then you should modify 
&lt;code class=&quot;highlighter-rouge&quot;&gt;kylin.properties&lt;/code&gt; 
and &lt;code 
class=&quot;highlighter-rouge&quot;&gt;applicationContext.xml&lt;/code&gt;.&lt;/p&gt;
-
-&lt;ul&gt;
-  &lt;li&gt;kylin.properties&lt;/li&gt;
-&lt;/ul&gt;
-
-&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code 
class=&quot;language-groff&quot; 
data-lang=&quot;groff&quot;&gt;kylin.cache.memcached.hosts=10.1.2.42:11211
-kylin.query.cache-signature-enabled=true
-kylin.query.lazy-query-enabled=true
-kylin.metrics.memcached.enabled=true
-kylin.query.segment-cache-enabled=true&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
-
-&lt;ul&gt;
-  &lt;li&gt;applicationContext.xml&lt;/li&gt;
-&lt;/ul&gt;
-
-&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code 
class=&quot;language-groff&quot; 
data-lang=&quot;groff&quot;&gt;&amp;lt;cache:annotation-driven/&amp;gt;
-
-&amp;lt;bean id=&quot;ehcache&quot; 
class=&quot;org.springframework.cache.ehcache.EhCacheManagerFactoryBean&quot;
-      p:configLocation=&quot;classpath:ehcache-test.xml&quot; 
p:shared=&quot;true&quot;/&amp;gt;
-
-&amp;lt;bean id=&quot;remoteCacheManager&quot; 
class=&quot;org.apache.kylin.cache.cachemanager.MemcachedCacheManager&quot;/&amp;gt;
-&amp;lt;bean id=&quot;localCacheManager&quot; 
class=&quot;org.apache.kylin.cache.cachemanager.InstrumentedEhCacheCacheManager&quot;
-      p:cacheManager-ref=&quot;ehcache&quot;/&amp;gt;
-&amp;lt;bean id=&quot;cacheManager&quot; 
class=&quot;org.apache.kylin.cache.cachemanager.RemoteLocalFailOverCacheManager&quot;/&amp;gt;
-
-&amp;lt;bean id=&quot;memcachedCacheConfig&quot; 
class=&quot;org.apache.kylin.cache.memcached.MemcachedCacheConfig&quot;&amp;gt;
-    &amp;lt;property name=&quot;timeout&quot; value=&quot;500&quot;/&amp;gt;
-    &amp;lt;property name=&quot;hosts&quot; 
value=&quot;${kylin.cache.memcached.hosts}&quot;/&amp;gt;
-&amp;lt;/bean&amp;gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
-
-&lt;h3 id=&quot;configuration-for-query-cache&quot;&gt;Configuration for query 
cache&lt;/h3&gt;
-
-&lt;h4 id=&quot;general-part&quot;&gt;General part&lt;/h4&gt;
-
-&lt;table&gt;
-  &lt;thead&gt;
-    &lt;tr&gt;
-      &lt;th style=&quot;text-align: left&quot;&gt;Conf Key&lt;/th&gt;
-      &lt;th style=&quot;text-align: left&quot;&gt;Conf value&lt;/th&gt;
-      &lt;th style=&quot;text-align: left&quot;&gt;Explanation&lt;/th&gt;
-    &lt;/tr&gt;
-  &lt;/thead&gt;
-  &lt;tbody&gt;
-    &lt;tr&gt;
-      &lt;td style=&quot;text-align: 
left&quot;&gt;kylin.query.cache-enabled&lt;/td&gt;
-      &lt;td style=&quot;text-align: left&quot;&gt;boolean, default 
true&lt;/td&gt;
-      &lt;td style=&quot;text-align: left&quot;&gt;whether to enable query 
cache&lt;/td&gt;
-    &lt;/tr&gt;
-    &lt;tr&gt;
-      &lt;td style=&quot;text-align: 
left&quot;&gt;kylin.query.cache-threshold-duration&lt;/td&gt;
-      &lt;td style=&quot;text-align: left&quot;&gt;long, in milliseconds, 
default is 2000&lt;/td&gt;
-      &lt;td style=&quot;text-align: left&quot;&gt;query duration 
threshold&lt;/td&gt;
-    &lt;/tr&gt;
-    &lt;tr&gt;
-      &lt;td style=&quot;text-align: 
left&quot;&gt;kylin.query.cache-threshold-scan-count&lt;/td&gt;
-      &lt;td style=&quot;text-align: left&quot;&gt;long, default is 
10240&lt;/td&gt;
-      &lt;td style=&quot;text-align: left&quot;&gt;query scan row count 
threshold&lt;/td&gt;
-    &lt;/tr&gt;
-    &lt;tr&gt;
-      &lt;td style=&quot;text-align: 
left&quot;&gt;kylin.query.cache-threshold-scan-bytes&lt;/td&gt;
-      &lt;td style=&quot;text-align: left&quot;&gt;long, default is 1024 * 
1024 (1MB)&lt;/td&gt;
-      &lt;td style=&quot;text-align: left&quot;&gt;query scan byte 
threshold&lt;/td&gt;
-    &lt;/tr&gt;
-  &lt;/tbody&gt;
-&lt;/table&gt;
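One plausible reading of how the three thresholds above interact (a sketch of the intended semantics, not the actual Kylin implementation): a result is worth caching only when the query was expensive by at least one of the measures, since a cheap query is faster to recompute than to cache and fetch.

```python
# Hypothetical sketch: cache a query result only when the query was
# expensive by at least one of the three configured thresholds.
DURATION_MS_THRESHOLD = 2000          # kylin.query.cache-threshold-duration
SCAN_COUNT_THRESHOLD = 10240          # kylin.query.cache-threshold-scan-count
SCAN_BYTES_THRESHOLD = 1024 * 1024    # kylin.query.cache-threshold-scan-bytes

def worth_caching(duration_ms: int, scan_rows: int, scan_bytes: int) -> bool:
    """True if the query exceeded any of the expense thresholds."""
    return (duration_ms >= DURATION_MS_THRESHOLD
            or scan_rows >= SCAN_COUNT_THRESHOLD
            or scan_bytes >= SCAN_BYTES_THRESHOLD)

print(worth_caching(2500, 100, 4096))   # slow query: cache it
print(worth_caching(50, 100, 4096))     # cheap query: skip caching
```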
-
-&lt;h4 id=&quot;memcached-part&quot;&gt;Memcached part&lt;/h4&gt;
-
-&lt;table&gt;
-  &lt;thead&gt;
-    &lt;tr&gt;
-      &lt;th style=&quot;text-align: left&quot;&gt;Conf Key&lt;/th&gt;
-      &lt;th style=&quot;text-align: left&quot;&gt;Conf value&lt;/th&gt;
-      &lt;th style=&quot;text-align: left&quot;&gt;Explanation&lt;/th&gt;
-    &lt;/tr&gt;
-  &lt;/thead&gt;
-  &lt;tbody&gt;
-    &lt;tr&gt;
-      &lt;td style=&quot;text-align: 
left&quot;&gt;kylin.cache.memcached.hosts&lt;/td&gt;
-      &lt;td style=&quot;text-align: 
left&quot;&gt;host1:port1,host2:port2&lt;/td&gt;
-      &lt;td style=&quot;text-align: left&quot;&gt;host list of memcached 
host&lt;/td&gt;
-    &lt;/tr&gt;
-    &lt;tr&gt;
-      &lt;td style=&quot;text-align: 
left&quot;&gt;kylin.query.segment-cache-enabled&lt;/td&gt;
-      &lt;td style=&quot;text-align: left&quot;&gt;default false&lt;/td&gt;
-      &lt;td style=&quot;text-align: left&quot;&gt;whether to enable segment 
cache&lt;/td&gt;
-    &lt;/tr&gt;
-    &lt;tr&gt;
-      &lt;td style=&quot;text-align: 
left&quot;&gt;kylin.query.segment-cache-timeout&lt;/td&gt;
-      &lt;td style=&quot;text-align: left&quot;&gt;default 2000&lt;/td&gt;
-      &lt;td style=&quot;text-align: left&quot;&gt;timeout of memcached 
operations, in milliseconds&lt;/td&gt;
-    &lt;/tr&gt;
-    &lt;tr&gt;
-      &lt;td style=&quot;text-align: 
left&quot;&gt;kylin.query.segment-cache-max-size&lt;/td&gt;
-      &lt;td style=&quot;text-align: left&quot;&gt;200 (MB)&lt;/td&gt;
-      &lt;td style=&quot;text-align: left&quot;&gt;max size of a segment 
result put into memcached&lt;/td&gt;
-    &lt;/tr&gt;
-  &lt;/tbody&gt;
-&lt;/table&gt;
-
-&lt;h4 id=&quot;cache-signature-part&quot;&gt;Cache signature part&lt;/h4&gt;
-
-&lt;table&gt;
-  &lt;thead&gt;
-    &lt;tr&gt;
-      &lt;th style=&quot;text-align: left&quot;&gt;Conf Key&lt;/th&gt;
-      &lt;th style=&quot;text-align: left&quot;&gt;Conf value&lt;/th&gt;
-      &lt;th style=&quot;text-align: left&quot;&gt;Explanation&lt;/th&gt;
-    &lt;/tr&gt;
-  &lt;/thead&gt;
-  &lt;tbody&gt;
-    &lt;tr&gt;
-      &lt;td style=&quot;text-align: 
left&quot;&gt;kylin.query.cache-signature-enabled&lt;/td&gt;
-      &lt;td style=&quot;text-align: left&quot;&gt;default false&lt;/td&gt;
-      &lt;td style=&quot;text-align: left&quot;&gt;whether to use signature 
for query cache&lt;/td&gt;
-    &lt;/tr&gt;
-    &lt;tr&gt;
-      &lt;td style=&quot;text-align: 
left&quot;&gt;kylin.query.signature-class&lt;/td&gt;
-      &lt;td style=&quot;text-align: left&quot;&gt;default is 
org.apache.kylin.rest.signature.FactTableRealizationSetCalculator&lt;/td&gt;
-      &lt;td style=&quot;text-align: left&quot;&gt;use which class to 
calculate signature of query cache&lt;/td&gt;
-    &lt;/tr&gt;
-  &lt;/tbody&gt;
-&lt;/table&gt;
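The idea behind the signature above can be sketched as follows (names here are illustrative, not Kylin's actual classes): a signature summarizes the realizations (cubes and their build state) that can answer a query, so when any of them changes, the recomputed signature no longer matches the one stored with the cached entry and the stale result is discarded instead of served.

```python
# Illustrative sketch (hypothetical names): signature-based invalidation
# for a query cache. The signature hashes each realization's name and
# last-build time; any rebuild changes the signature.
import hashlib

def signature(realizations: dict[str, str]) -> str:
    """Hash cube-name -> last-build-time pairs into a stable signature."""
    parts = sorted(f"{name}:{ts}" for name, ts in realizations.items())
    return hashlib.sha256(";".join(parts).encode()).hexdigest()

cached_sig = signature({"sales_cube": "2019-07-01"})
# A segment is rebuilt: the recomputed signature differs, so the cached
# entry is treated as stale and invalidated.
current_sig = signature({"sales_cube": "2019-07-30"})
print(cached_sig == current_sig)  # False -> invalidate the cached result
```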
-
-&lt;h4 id=&quot;other-optimize-part&quot;&gt;Other optimize part&lt;/h4&gt;
-
-&lt;table&gt;
-  &lt;thead&gt;
-    &lt;tr&gt;
-      &lt;th style=&quot;text-align: left&quot;&gt;Conf Key&lt;/th&gt;
-      &lt;th style=&quot;text-align: left&quot;&gt;Conf value&lt;/th&gt;
-      &lt;th style=&quot;text-align: left&quot;&gt;Explanation&lt;/th&gt;
-    &lt;/tr&gt;
-  &lt;/thead&gt;
-  &lt;tbody&gt;
-    &lt;tr&gt;
-      &lt;td style=&quot;text-align: 
left&quot;&gt;kylin.query.lazy-query-enabled&lt;/td&gt;
-      &lt;td style=&quot;text-align: left&quot;&gt;default false&lt;/td&gt;
-      &lt;td style=&quot;text-align: left&quot;&gt;whether to block duplicated 
sql query&lt;/td&gt;
-    &lt;/tr&gt;
-    &lt;tr&gt;
-      &lt;td style=&quot;text-align: 
left&quot;&gt;kylin.query.lazy-query-waiting-timeout-milliseconds&lt;/td&gt;
-      &lt;td style=&quot;text-align: left&quot;&gt;long, in milliseconds, 
default is 60000&lt;/td&gt;
-      &lt;td style=&quot;text-align: left&quot;&gt;max duration for blocking 
duplicated sql query&lt;/td&gt;
-    &lt;/tr&gt;
-  &lt;/tbody&gt;
-&lt;/table&gt;
-
-&lt;h4 id=&quot;metrics-part&quot;&gt;Metrics part&lt;/h4&gt;
-
-&lt;table&gt;
-  &lt;thead&gt;
-    &lt;tr&gt;
-      &lt;th style=&quot;text-align: left&quot;&gt;Conf Key&lt;/th&gt;
-      &lt;th style=&quot;text-align: left&quot;&gt;Conf value&lt;/th&gt;
-      &lt;th style=&quot;text-align: left&quot;&gt;Explanation&lt;/th&gt;
-    &lt;/tr&gt;
-  &lt;/thead&gt;
-  &lt;tbody&gt;
-    &lt;tr&gt;
-      &lt;td style=&quot;text-align: 
left&quot;&gt;kylin.metrics.memcached.enabled&lt;/td&gt;
-      &lt;td style=&quot;text-align: left&quot;&gt;true&lt;/td&gt;
-      &lt;td style=&quot;text-align: left&quot;&gt;Enable memcached metrics in 
memcached.&lt;/td&gt;
-    &lt;/tr&gt;
-    &lt;tr&gt;
-      &lt;td style=&quot;text-align: 
left&quot;&gt;kylin.metrics.memcached.metricstype&lt;/td&gt;
-      &lt;td style=&quot;text-align: 
left&quot;&gt;off/performance/debug&lt;/td&gt;
-      &lt;td style=&quot;text-align: left&quot;&gt;refer to 
net.spy.memcached.metrics.MetricType&lt;/td&gt;
-    &lt;/tr&gt;
-  &lt;/tbody&gt;
-&lt;/table&gt;
-</description>
-        <pubDate>Tue, 30 Jul 2019 03:30:00 -0700</pubDate>
-        
<link>http://kylin.apache.org/blog/2019/07/30/detailed-analysis-of-refine-query-cache/</link>
-        <guid 
isPermaLink="true">http://kylin.apache.org/blog/2019/07/30/detailed-analysis-of-refine-query-cache/</guid>
-        
-        
-        <category>blog</category>
-        
-      </item>
-    
-      <item>
-        <title>Deep dive into Kylin&#39;s Real-time OLAP</title>
-        <description>&lt;h2 id=&quot;preface&quot;&gt;Preface&lt;/h2&gt;
-
-&lt;p&gt;At the beginning of Apache Kylin, the main purpose was to solve the 
need for interactive data analysis on massive data. The data source mainly 
comes from the data warehouse (Hive), and the data is mostly historical rather 
than real-time. Streaming data processing is a brand-new field of big data 
development that requires data to be queried as soon as it enters the 
system (second-level latency). Until now (the latest release of v2.6), Apache 
Kylin’s main capabilities are still in the field of historical data analysis; 
even though the NRT (near real-time streaming) feature was introduced in v1.6, 
there is still a delay of several minutes, making it difficult to meet 
real-time query requirements.&lt;/p&gt;
-
-&lt;p&gt;To keep up with the trend of big data development, 
&lt;strong&gt;eBay&lt;/strong&gt;’s Kylin development team (&lt;a 
href=&quot;https://github.com/allenma&quot;&gt;allenma&lt;/a&gt;, &lt;a 
href=&quot;https://github.com/mingmwang&quot;&gt;mingmwang&lt;/a&gt;, &lt;a 
href=&quot;https://github.com/sanjulian&quot;&gt;sanjulian&lt;/a&gt;, &lt;a 
href=&quot;https://github.com/wangshisan&quot;&gt;wangshisan&lt;/a&gt;, etc.) 
developed the Real-time OLAP feature on top of Kylin to enable real-time 
querying of Kafka streaming data. This feature has been used in production at 
&lt;strong&gt;eBay&lt;/strong&gt; and has been running stably for more than 
one year. It was contributed to the community in December 2018.&lt;/p&gt;
-
-&lt;p&gt;In this article, we will focus on introducing and analyzing Apache 
Kylin’s Real-time OLAP feature, its usage, benchmarking, etc. In 
&lt;strong&gt;What is Real-time OLAP&lt;/strong&gt;, we will introduce the 
architecture, concepts and features. In &lt;strong&gt;How to use Real-time 
OLAP&lt;/strong&gt;, we will introduce the deployment, enabling and monitoring 
aspects of the Receiver cluster. Finally, in the &lt;strong&gt;Real-time OLAP 
FAQ&lt;/strong&gt;, we will answer some common questions about the meaning of 
important configuration entries, usage restrictions, and future development 
plans.&lt;/p&gt;
-
-&lt;ul&gt;
-  &lt;li&gt;
-    &lt;p&gt;What is Real-time OLAP&lt;/p&gt;
-
-    &lt;ul&gt;
-      &lt;li&gt;The importance of streaming data processing&lt;/li&gt;
-      &lt;li&gt;Introduction to Real-time OLAP&lt;/li&gt;
-      &lt;li&gt;Real-time OLAP concepts and roles&lt;/li&gt;
-      &lt;li&gt;Real-time OLAP architecture&lt;/li&gt;
-      &lt;li&gt;Real-time OLAP features&lt;/li&gt;
-      &lt;li&gt;Real-time OLAP metadata&lt;/li&gt;
-      &lt;li&gt;Real-time OLAP Local Segment Cache&lt;/li&gt;
-      &lt;li&gt;The status of Streaming Segment and its 
transformation&lt;/li&gt;
-      &lt;li&gt;Real-time OLAP build process analysis&lt;/li&gt;
-      &lt;li&gt;Real-time OLAP query process analysis&lt;/li&gt;
-      &lt;li&gt;Real-time OLAP Rebalance process analysis&lt;/li&gt;
-    &lt;/ul&gt;
-  &lt;/li&gt;
-  &lt;li&gt;
-    &lt;p&gt;How to use Real-time OLAP&lt;/p&gt;
-
-    &lt;ul&gt;
-      &lt;li&gt;Deploy Coordinator and Receiver&lt;/li&gt;
-      &lt;li&gt;Configuring Streaming Table&lt;/li&gt;
-      &lt;li&gt;Add and modify Replica Set&lt;/li&gt;
-      &lt;li&gt;Design model and cube&lt;/li&gt;
-      &lt;li&gt;Enable and stop Cube&lt;/li&gt;
-      &lt;li&gt;Monitor consumption status&lt;/li&gt;
-      &lt;li&gt;Coordinator Rest API Description&lt;/li&gt;
-    &lt;/ul&gt;
-  &lt;/li&gt;
-  &lt;li&gt;
-    &lt;p&gt;Frequently Asked Questions for Real-time OLAP&lt;/p&gt;
-
-    &lt;ul&gt;
-      &lt;li&gt;There is a “Lambda” checkbox when configuring the Kafka 
data source. What does it do?&lt;/li&gt;
-      &lt;li&gt;In addition to the base cuboid, can I build other cuboids on 
the receiver side?&lt;/li&gt;
-      &lt;li&gt;How should I scale out my receiver cluster? How do I deal 
with a partition increase for a Kafka topic?&lt;/li&gt;
-      &lt;li&gt;What is the benchmark result? What is the approximate query 
latency? What is the approximate data ingest rate of a single 
Receiver?&lt;/li&gt;
-      &lt;li&gt;Between Real-time OLAP and Kylin’s NRT Streaming, which one 
is more suitable for my needs?&lt;/li&gt;
-      &lt;li&gt;What are the main limitations of Real-time OLAP? What are the 
future development plans?&lt;/li&gt;
-    &lt;/ul&gt;
-  &lt;/li&gt;
-&lt;/ul&gt;
-
-&lt;h2 id=&quot;part-i-what-is-real-time-olap-for-kylin&quot;&gt;Part-I. What 
is Real-time OLAP for Kylin&lt;/h2&gt;
-
-&lt;hr /&gt;
-
-&lt;h3 id=&quot;streaming-data-processing-and-real-time-olap&quot;&gt;1.1 
Streaming Data Processing and Real-time OLAP&lt;/h3&gt;
-&lt;p&gt;For many commercial companies, user messages are analyzed for the 
purpose of making better business decisions and better market planning. The 
earlier a message enters the data analysis platform, the faster decision 
makers can respond, reducing wasted time and money. Streaming data processing 
means faster feedback, and decision makers can make more frequent and flexible 
planning adjustments.&lt;/p&gt;
-
-&lt;p&gt;There are various types of data sources in a company, including 
servers, mobile devices such as phones, and IoT devices. Messages 
from different sources are often distinguished by different topics and 
aggregated into a message queue (Message Queue/Message Bus) for data analysis. 
Traditional data analysis tools use batch tools such as MapReduce for data 
analysis, which incurs large data delays, typically hours to days. As you can 
see from the figure below, the main data latency comes from two processes: 
extracting from the message queue through the ETL process to the data 
warehouse, and extracting data from the data warehouse for precomputation to 
save the results as cube data. Since both of these parts are calculated using 
batch-compute programs, the calculations take a long time, which makes 
real-time query difficult to achieve. We believe that to solve this problem we 
need to bypass these processes by building a bridge between data collection 
and the OLAP platform, letting the data go directly to the OLAP 
platform.&lt;/p&gt;
-
-&lt;p&gt;&lt;img 
src=&quot;/images/blog/deep-dive-realtime-olap/pic-1.png&quot; 
alt=&quot;diagram1&quot; /&gt;&lt;/p&gt;
-
-&lt;p&gt;There are already some mature real-time OLAP solutions, such as 
Druid, that provide lower data latency by combining query results from the 
real-time and historical parts. Kylin has reached a certain level of maturity 
in analyzing massive historical data; to take a step toward real-time OLAP, 
Kylin developers built the Real-time OLAP feature.&lt;/p&gt;
-
-&lt;hr /&gt;
-
-&lt;h3 id=&quot;introduction-to-real-time-olap&quot;&gt;1.2 Introduction to 
Real-time OLAP&lt;/h3&gt;
-

[... 383 lines stripped ...]
Added: kylin/site/images/blog/local-cache/Local_cache_stage.png
URL: 
http://svn.apache.org/viewvc/kylin/site/images/blog/local-cache/Local_cache_stage.png?rev=1894464&view=auto
==============================================================================
Binary file - no diff available.

