Hi Chunen and Kang, is there any follow-up JIRA for this?

Best regards,

Shaofeng Shi 史少锋
Apache Kylin PMC
Email: shaofeng...@apache.org

Apache Kylin FAQ: https://kylin.apache.org/docs/gettingstarted/faq.html
Join Kylin user mail group: user-subscr...@kylin.apache.org
Join Kylin dev mail group: dev-subscr...@kylin.apache.org




On Mon, Jan 13, 2020 at 10:46 AM, nichunen <n...@apache.org> wrote:

> Agree with Yaqian; we could set the default value to FALSE.
>
>
>
> Best regards,
>
>
>
> Ni Chunen / George
>
>
>
> On 01/9/2020 10:41, Zhou Kang <zhoukan...@outlook.com> wrote:
> ( ̄▽ ̄)” It seems the mailing list disables rich text.
> Kylin sample data
>
> +────────────────────+────────────────+───────────────────────────+───────────────────────────────+
> | Query Result Size  | Compress Time  | Query Duration(Compress)  | Query Duration(Uncompressed)  |
> +────────────────────+────────────────+───────────────────────────+───────────────────────────────+
> | 0.25M              | 5ms            | 0.18s                     | 0.23s                         |
> | 0.5M               | 20ms           | 0.38s                     | 0.38s                         |
> | 0.7M               | 25ms           | 0.52s                     | 0.45s                         |
> +────────────────────+────────────────+───────────────────────────+───────────────────────────────+
>
> SSB data
>
> +────────────────────+────────────────+───────────────────────────+───────────────────────────────+
> | Query Result Size  | Compress Time  | Query Duration(Compress)  | Query Duration(Uncompressed)  |
> +────────────────────+────────────────+───────────────────────────+───────────────────────────────+
> | 0.25M              | 4ms            | 0.12s                     | 0.15s                         |
> | 0.5M               | 7ms            | 0.25s                     | 0.24s                         |
> | 0.7M               | 10ms           | 0.35s                     | 0.35s                         |
> | 1M                 | 13ms           | 0.41s                     | 0.39s                         |
> | 5M                 | 63ms           | 2.26s                     | 2.27s                         |
> | 10M                | 135ms          | 5.10s                     | 4.90s                         |
> | 16M                | 215ms          | 7.89s                     | 7.60s                         |
> +────────────────────+────────────────+───────────────────────────+───────────────────────────────+
>
> From: Zhou Kang <zhoukan...@outlook.com>
> Reply-To: "dev@kylin.apache.org" <dev@kylin.apache.org>
> Date: Thursday, January 9, 2020, 10:34 AM
> To: "dev@kylin.apache.org" <dev@kylin.apache.org>, Yaqian Zhang <yaqian_zh...@126.com>
> Subject: Re: [DISCUSS] Cost-benefit of HBase scan result compression
>
> Hi, Yaqian Zhang:
>
> Thanks for your query latency tests.
>
> I retyped the test data for easier reading:
>
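> Kylin sample data
>
> +────────────────────+────────────────+───────────────────────────+───────────────────────────────+
> | Query Result Size  | Compress Time  | Query Duration(Compress)  | Query Duration(Uncompressed)  |
> +────────────────────+────────────────+───────────────────────────+───────────────────────────────+
> | 0.25M              | 5ms            | 0.18s                     | 0.23s                         |
> | 0.5M               | 20ms           | 0.38s                     | 0.38s                         |
> | 0.7M               | 25ms           | 0.52s                     | 0.45s                         |
> +────────────────────+────────────────+───────────────────────────+───────────────────────────────+
>
> SSB data
>
> +────────────────────+────────────────+───────────────────────────+───────────────────────────────+
> | Query Result Size  | Compress Time  | Query Duration(Compress)  | Query Duration(Uncompressed)  |
> +────────────────────+────────────────+───────────────────────────+───────────────────────────────+
> | 0.25M              | 4ms            | 0.12s                     | 0.15s                         |
> | 0.5M               | 7ms            | 0.25s                     | 0.24s                         |
> | 0.7M               | 10ms           | 0.35s                     | 0.35s                         |
> | 1M                 | 13ms           | 0.41s                     | 0.39s                         |
> | 5M                 | 63ms           | 2.26s                     | 2.27s                         |
> | 10M                | 135ms          | 5.10s                     | 4.90s                         |
> | 16M                | 215ms          | 7.89s                     | 7.60s                         |
> +────────────────────+────────────────+───────────────────────────+───────────────────────────────+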
>
>
> From: Yaqian Zhang <yaqian_zh...@126.com>
> Reply-To: "dev@kylin.apache.org" <dev@kylin.apache.org>
> Date: Wednesday, January 8, 2020, 8:04 PM
> To: "dev@kylin.apache.org" <dev@kylin.apache.org>
> Subject: Re: [DISCUSS] Cost-benefit of HBase scan result compression
>
> Hi:
>
> I have tested the query latency in both cases.
>
> In our CDH cluster environment, I got the following experimental results.
>
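> Kylin sample data
>
> +────────────────────+────────────────+───────────────────────────+───────────────────────────────+
> | Query Result Size  | Compress Time  | Query Duration(Compress)  | Query Duration(Uncompressed)  |
> +────────────────────+────────────────+───────────────────────────+───────────────────────────────+
> | 0.25M              | 5ms            | 0.18s                     | 0.23s                         |
> | 0.5M               | 20ms           | 0.38s                     | 0.38s                         |
> | 0.7M               | 25ms           | 0.52s                     | 0.45s                         |
> +────────────────────+────────────────+───────────────────────────+───────────────────────────────+
>
> SSB data
>
> +────────────────────+────────────────+───────────────────────────+───────────────────────────────+
> | Query Result Size  | Compress Time  | Query Duration(Compress)  | Query Duration(Uncompressed)  |
> +────────────────────+────────────────+───────────────────────────+───────────────────────────────+
> | 0.25M              | 4ms            | 0.12s                     | 0.15s                         |
> | 0.5M               | 7ms            | 0.25s                     | 0.24s                         |
> | 0.7M               | 10ms           | 0.35s                     | 0.35s                         |
> | 1M                 | 13ms           | 0.41s                     | 0.39s                         |
> | 5M                 | 63ms           | 2.26s                     | 2.27s                         |
> | 10M                | 135ms          | 5.10s                     | 4.90s                         |
> | 16M                | 215ms          | 7.89s                     | 7.60s                         |
> +────────────────────+────────────────+───────────────────────────+───────────────────────────────+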
>
> Conclusion:
> Enabling compression improves query speed when the result size is below 0.5M.
> Enabling compression generally reduces query speed when the result size is
> above 1M.
>
> So it is recommended to set the default value of
> kylin.storage.hbase.endpoint-compress-result to false.
>
>
> On Jan 4, 2020, at 19:35, Yaqian Zhang <yaqian_zh...@126.com> wrote:
> Hi Kang:
> Thank you for the comparison and report!
> I will test and verify the query latency for this.
> On Jan 3, 2020, at 10:32, Zhou Kang <zhoukan...@outlook.com> wrote:
> Hi all,
> kylin.storage.hbase.endpoint-compress-result is TRUE by default.
> At Xiaomi Group, we found that compression can push query latency up to
> 30 seconds or more. After analyzing the HBase logs, we found that compression
> is useless in most situations.
> For details, see:
> https://issues.apache.org/jira/browse/KYLIN-4322
> In addition, in our environment:
> 1.     Only 0.05% of the data is larger than 1M.
> 2.     Almost 70% of the compressed data is larger than the source data.
> So, should we set this config to FALSE by default?
> Also, kylin.storage.hbase.endpoint-compress-result should be overridable at
> the cube or project level, which is currently forbidden in
> CubeVisitService:visitCube.
>
>
>
>
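
For readers who want to check the quoted conclusion against the numbers, here is a minimal Python sketch that computes, for each SSB row in the thread, the difference between the compressed and uncompressed query durations. The figures are copied from the table above; the names `SSB` and `compression_overhead` are made up for this illustration and are not part of Kylin.

```python
# Measurements copied from the SSB table quoted in this thread:
# (result size label, compress time in ms,
#  query duration with compression in s, query duration without compression in s)
SSB = [
    ("0.25M", 4, 0.12, 0.15),
    ("0.5M", 7, 0.25, 0.24),
    ("0.7M", 10, 0.35, 0.35),
    ("1M", 13, 0.41, 0.39),
    ("5M", 63, 2.26, 2.27),
    ("10M", 135, 5.10, 4.90),
    ("16M", 215, 7.89, 7.60),
]


def compression_overhead(rows):
    """Return (size, delta) pairs, where delta = compressed - uncompressed seconds.

    A positive delta means the query was slower with compression enabled.
    """
    return [
        (size, round(compressed - uncompressed, 2))
        for size, _compress_ms, compressed, uncompressed in rows
    ]


for size, delta in compression_overhead(SSB):
    if delta > 0:
        verdict = "slower with compression"
    elif delta < 0:
        verdict = "faster with compression"
    else:
        verdict = "no difference"
    print(f"{size:>6}: {delta:+.2f} s ({verdict})")
```

These are single measurements from one CDH cluster, so small deltas may well be within noise; the sketch only restates the reported figures.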

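
The report above that almost 70% of the compressed data was larger than the source is consistent with a general property of compressors: on small or non-repetitive payloads, the format's fixed overhead can outweigh any savings. The sketch below illustrates this with Python's standard `zlib` module. This is an assumption for illustration only; the thread does not say which codec Kylin uses for endpoint results, and the payloads here are synthetic.

```python
import zlib


def compressed_ratio(payload: bytes) -> float:
    """Compressed size divided by original size.

    A value above 1.0 means compression made the payload bigger.
    """
    return len(zlib.compress(payload)) / len(payload)


# 64 distinct bytes: nothing repeats, so DEFLATE finds no matches to exploit
# and the zlib header and checksum push the output past the input size.
small_distinct = bytes(range(64))

# A highly repetitive 100,000-byte payload, which compresses very well.
large_repetitive = b"kylin-row;" * 10_000

print(f"small, distinct bytes: ratio {compressed_ratio(small_distinct):.2f}")
print(f"large, repetitive:     ratio {compressed_ratio(large_repetitive):.4f}")
```

Whether this effect explains the Xiaomi observation depends on the actual result sizes and encodings in that environment, which the thread does not detail.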