Hi Chunen and Kang,

Is there any follow-up JIRA for this?

Best regards,
Shaofeng Shi 史少锋
Apache Kylin PMC
Email: shaofeng...@apache.org

Apache Kylin FAQ: https://kylin.apache.org/docs/gettingstarted/faq.html
Join Kylin user mail group: user-subscr...@kylin.apache.org
Join Kylin dev mail group: dev-subscr...@kylin.apache.org

On Mon, Jan 13, 2020 at 10:46 AM, nichunen <n...@apache.org> wrote:

> Agree with Yaqian, we may set the default value to FALSE.
>
> Best regards,
>
> Ni Chunen / George
>
> On 01/9/2020 10:41, Zhou Kang <zhoukan...@outlook.com> wrote:
> ( ̄▽ ̄)” Seems the mailing list disables rich text.
>
> kylin sample data
>
> +───────────────────+───────────────+─────────────────────────────+───────────────────────────────+
> | Query Result Size | Compress Time | Query Duration (Compressed) | Query Duration (Uncompressed) |
> +───────────────────+───────────────+─────────────────────────────+───────────────────────────────+
> | 0.25M             | 5ms           | 0.18s                       | 0.23s                         |
> | 0.5M              | 20ms          | 0.38s                       | 0.38s                         |
> | 0.7M              | 25ms          | 0.52s                       | 0.45s                         |
> +───────────────────+───────────────+─────────────────────────────+───────────────────────────────+
>
> SSB data
>
> +───────────────────+───────────────+─────────────────────────────+───────────────────────────────+
> | Query Result Size | Compress Time | Query Duration (Compressed) | Query Duration (Uncompressed) |
> +───────────────────+───────────────+─────────────────────────────+───────────────────────────────+
> | 0.25M             | 4ms           | 0.12s                       | 0.15s                         |
> | 0.5M              | 7ms           | 0.25s                       | 0.24s                         |
> | 0.7M              | 10ms          | 0.35s                       | 0.35s                         |
> | 1M                | 13ms          | 0.41s                       | 0.39s                         |
> | 5M                | 63ms          | 2.26s                       | 2.27s                         |
> | 10M               | 135ms         | 5.10s                       | 4.90s                         |
> | 16M               | 215ms         | 7.89s                       | 7.60s                         |
> +───────────────────+───────────────+─────────────────────────────+───────────────────────────────+
>
> From: Zhou Kang <zhoukan...@outlook.com>
> Reply-To: "dev@kylin.apache.org" <dev@kylin.apache.org>
> Date: Thursday, January 9, 2020, 10:34 AM
> To: "dev@kylin.apache.org" <dev@kylin.apache.org>, Yaqian Zhang <yaqian_zh...@126.com>
> Subject: Re: [DISCUSS] Cost-benefit of HBase scan result compression
>
> Hi, Yaqian Zhang:
>
> Thanks for your query latency tests.
>
> I retyped the test data for easy reading.
>
> kylin sample data
>
> +───────────────────+───────────────+─────────────────────────────+───────────────────────────────+
> | Query Result Size | Compress Time | Query Duration (Compressed) | Query Duration (Uncompressed) |
> +───────────────────+───────────────+─────────────────────────────+───────────────────────────────+
> | 0.25M             | 5ms           | 0.18s                       | 0.23s                         |
> | 0.5M              | 20ms          | 0.38s                       | 0.38s                         |
> | 0.7M              | 25ms          | 0.52s                       | 0.45s                         |
> +───────────────────+───────────────+─────────────────────────────+───────────────────────────────+
>
> SSB data
>
> +───────────────────+───────────────+─────────────────────────────+───────────────────────────────+
> | Query Result Size | Compress Time | Query Duration (Compressed) | Query Duration (Uncompressed) |
> +───────────────────+───────────────+─────────────────────────────+───────────────────────────────+
> | 0.25M             | 4ms           | 0.12s                       | 0.15s                         |
> | 0.5M              | 7ms           | 0.25s                       | 0.24s                         |
> | 0.7M              | 10ms          | 0.35s                       | 0.35s                         |
> | 1M                | 13ms          | 0.41s                       | 0.39s                         |
> | 5M                | 63ms          | 2.26s                       | 2.27s                         |
> | 10M               | 135ms         | 5.10s                       | 4.90s                         |
> | 16M               | 215ms         | 7.89s                       | 7.60s                         |
> +───────────────────+───────────────+─────────────────────────────+───────────────────────────────+
>
> From: Yaqian Zhang <yaqian_zh...@126.com>
> Reply-To: "dev@kylin.apache.org" <dev@kylin.apache.org>
> Date: Wednesday, January 8, 2020, 8:04 PM
> To: "dev@kylin.apache.org" <dev@kylin.apache.org>
> Subject: Re: [DISCUSS] Cost-benefit of HBase scan result compression
>
> Hi:
>
> I have tested the query latency in both cases.
>
> In our CDH cluster environment, I got the following experimental results.
>
> kylin sample data
>
> +───────────────────+───────────────+─────────────────────────────+───────────────────────────────+
> | Query Result Size | Compress Time | Query Duration (Compressed) | Query Duration (Uncompressed) |
> +───────────────────+───────────────+─────────────────────────────+───────────────────────────────+
> | 0.25M             | 5ms           | 0.18s                       | 0.23s                         |
> | 0.5M              | 20ms          | 0.38s                       | 0.38s                         |
> | 0.7M              | 25ms          | 0.52s                       | 0.45s                         |
> +───────────────────+───────────────+─────────────────────────────+───────────────────────────────+
>
> SSB data
>
> +───────────────────+───────────────+─────────────────────────────+───────────────────────────────+
> | Query Result Size | Compress Time | Query Duration (Compressed) | Query Duration (Uncompressed) |
> +───────────────────+───────────────+─────────────────────────────+───────────────────────────────+
> | 0.25M             | 4ms           | 0.12s                       | 0.15s                         |
> | 0.5M              | 7ms           | 0.25s                       | 0.24s                         |
> | 0.7M              | 10ms          | 0.35s                       | 0.35s                         |
> | 1M                | 13ms          | 0.41s                       | 0.39s                         |
> | 5M                | 63ms          | 2.26s                       | 2.27s                         |
> | 10M               | 135ms         | 5.10s                       | 4.90s                         |
> | 16M               | 215ms         | 7.89s                       | 7.60s                         |
> +───────────────────+───────────────+─────────────────────────────+───────────────────────────────+
>
> Conclusion:
> Enabling compression improves query speed when the result size is below 0.5M.
> Turning on compression generally reduces query speed when the result size is
> above 1M.
>
> So, it is recommended to set the default value of
> kylin.storage.hbase.endpoint-compress-result to false.
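A quick way to check the conclusion against the SSB numbers above is to subtract the two duration columns row by row. The Python sketch below only re-computes that difference from the figures already posted in this thread; the names `SSB_ROWS` and `compression_deltas` are made up for illustration and nothing here comes from Kylin itself.

```python
# SSB rows copied from the table in this thread:
# (result size label, compress time in ms,
#  query duration with compression in s, query duration without compression in s)
SSB_ROWS = [
    ("0.25M", 4, 0.12, 0.15),
    ("0.5M", 7, 0.25, 0.24),
    ("0.7M", 10, 0.35, 0.35),
    ("1M", 13, 0.41, 0.39),
    ("5M", 63, 2.26, 2.27),
    ("10M", 135, 5.10, 4.90),
    ("16M", 215, 7.89, 7.60),
]


def compression_deltas(rows):
    """Return (size, delta_seconds) pairs; a positive delta means compression was slower."""
    return [
        (size, round(compressed - uncompressed, 2))
        for size, _compress_ms, compressed, uncompressed in rows
    ]


if __name__ == "__main__":
    for size, delta in compression_deltas(SSB_ROWS):
        verdict = "slower" if delta > 0 else "faster" if delta < 0 else "same"
        print(f"{size:>6}: {delta:+.2f}s ({verdict} with compression)")
```

The differences are within a few hundredths of a second up to 5M and only become clearly positive at 10M and 16M, so these are single measurements to be read with some caution.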
>
> On Jan 4, 2020, at 19:35, Yaqian Zhang <yaqian_zh...@126.com> wrote:
>
> Hi Kang:
> Thank you for your comparison and report!
> I will test and verify the query latency for this.
>
> On Jan 3, 2020, at 10:32, Zhou Kang <zhoukan...@outlook.com> wrote:
>
> Hi all,
> kylin.storage.hbase.endpoint-compress-result is TRUE by default.
> In Xiaomi Group, we found that compression can push query latency to
> 30 seconds or more. After analyzing the HBase logs, we found that compression
> is useless in most situations.
> For details, see:
> https://issues.apache.org/jira/browse/KYLIN-4322
> In addition, in our environment:
> 1. Only 0.05% of the data is bigger than 1M.
> 2. Almost 70% of the compressed data is larger than the source data.
> So, should we set this config to FALSE by default?
> Also, kylin.storage.hbase.endpoint-compress-result should be overridable at
> the cube or project level, which is forbidden in CubeVisitService:visitCube now.
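The point that compressed data can end up larger than the source is easy to reproduce in isolation. The sketch below is not Kylin's code path: it uses Python's standard `zlib` module as a stand-in for whatever codec the coprocessor uses (an assumption), and the helper name `compressed_size` and both payloads are hypothetical. It shows that a payload with little redundancy picks up a few bytes of framing overhead when deflated, while a repetitive payload shrinks.

```python
import random
import zlib


def compressed_size(payload: bytes) -> int:
    """Size in bytes of the payload after zlib (DEFLATE) compression at the default level."""
    return len(zlib.compress(payload))


# Hypothetical payloads, not taken from Kylin:
# 1 KiB of pseudo-random bytes stands in for data with little redundancy,
# 1 KiB of one repeated byte stands in for highly redundant data.
rng = random.Random(0)  # fixed seed so the run is repeatable
dense = bytes(rng.getrandbits(8) for _ in range(1024))
redundant = b"a" * 1024

if __name__ == "__main__":
    for name, payload in (("dense", dense), ("redundant", redundant)):
        print(f"{name}: {len(payload)} bytes -> {compressed_size(payload)} bytes")
```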