[
https://issues.apache.org/jira/browse/KYLIN-2501?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Dayue Gao updated KYLIN-2501:
-
Description:
*Problem*
When query server needs to handle millions of records, CubeTupleConverter could
become performance bottleneck.
An experiment shows that converting 5 millions records takes ~11s, which
accounts for 50% of the total query time.
*Motivation*
Records returned from each storage partition is guaranteed to be ordered.
Therefore we could reduce the number of records passed to CubeTupleConverter by
# merge sorted records from all partitions, similar to what we have done in
KYLIN-1787
# use a [stream
aggregate|https://blogs.msdn.microsoft.com/craigfr/2006/09/13/stream-aggregate/]
algorithm on merged stream to aggregate those records with the same key
*Proposal*
# Add a new physical operator GTStreamAggregateScanner which implements the
stream aggregate algorithm
# Refine SortedIteratorMergerWithLimit that was used to merge sort records from
different partitions. The previous implementation has performance issues
(KYLIN-2483) due to expensive record clone
# Leverage GTStreamAggregateScanner to aggregate records on merged stream
*Scope*
Stream aggregate has some good properties such as low memory usage and
streamable ordered outputs, making it better than hash/sort based alternatives
when input is already sorted. So I bet the new GTStreamAggregateScanner
operator can also be used to accelerate cubing and coprocessor aggregation in
certain cases. I'll focus on query server in this jira and leave those
optimizations as future works.
was:
*Problem*
When query server needs to handle millions of records from storage,
CubeTupleConverter could become performance bottleneck.
An experiment shows that converting 5 millions records takes ~11s, which
accounts for 50% of the total query time.
*Motivation*
Records returned from each storage partition is guaranteed to be ordered.
Therefore we could reduce the number of records passed to CubeTupleConverter by
# merge sorted records from all partitions, similar to what we have done in
KYLIN-1787
# use a [stream
aggregate|https://blogs.msdn.microsoft.com/craigfr/2006/09/13/stream-aggregate/]
algorithm on merged stream to aggregate those records with the same key
*Proposal*
# Add a new physical operator GTStreamAggregateScanner which implements the
stream aggregate algorithm
# Refine SortedIteratorMergerWithLimit that was used to merge sort records from
different partitions. The previous implementation has performance issues
(KYLIN-2483) due to expensive record clone
# Leverage GTStreamAggregateScanner to aggregate records on merged stream
*Scope*
Stream aggregate has some good properties such as low memory usage and
streamable ordered outputs, making it better than hash/sort based alternatives
when input is already sorted. So I bet the new GTStreamAggregateScanner
operator can also be used to accelerating cubing and coprocessor in certain
cases. I will leave it as future works.
> Stream Aggregate GTRecords at Query Server
> --
>
> Key: KYLIN-2501
> URL: https://issues.apache.org/jira/browse/KYLIN-2501
> Project: Kylin
> Issue Type: Improvement
> Components: Query Engine, Storage - HBase
>Affects Versions: v1.6.0
>Reporter: Dayue Gao
>Assignee: Dayue Gao
>
> *Problem*
> When query server needs to handle millions of records, CubeTupleConverter
> could become performance bottleneck.
> An experiment shows that converting 5 millions records takes ~11s, which
> accounts for 50% of the total query time.
> *Motivation*
> Records returned from each storage partition is guaranteed to be ordered.
> Therefore we could reduce the number of records passed to CubeTupleConverter
> by
> # merge sorted records from all partitions, similar to what we have done in
> KYLIN-1787
> # use a [stream
> aggregate|https://blogs.msdn.microsoft.com/craigfr/2006/09/13/stream-aggregate/]
> algorithm on merged stream to aggregate those records with the same key
> *Proposal*
> # Add a new physical operator GTStreamAggregateScanner which implements the
> stream aggregate algorithm
> # Refine SortedIteratorMergerWithLimit that was used to merge sort records
> from different partitions. The previous implementation has performance issues
> (KYLIN-2483) due to expensive record clone
> # Leverage GTStreamAggregateScanner to aggregate records on merged stream
> *Scope*
> Stream aggregate has some good properties such as low memory usage and
> streamable ordered outputs, making it better than hash/sort based
> alternatives when input is already sorted. So I bet the new
> GTStreamAggregateScanner operator can also be used to accelerate cubing and
> coprocessor aggregation in certain cases. I'll focus on query server in this
> jira and