[jira] [Updated] (KYLIN-2501) Stream Aggregate GTRecords at Query Server

2017-03-22 Thread Dayue Gao (JIRA)

 [ 
https://issues.apache.org/jira/browse/KYLIN-2501?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dayue Gao updated KYLIN-2501:
-
Attachment: KYLIN-2501.patch

Pass CI,patch uploaded.

> Stream Aggregate GTRecords at Query Server
> --
>
> Key: KYLIN-2501
> URL: https://issues.apache.org/jira/browse/KYLIN-2501
> Project: Kylin
>  Issue Type: Improvement
>  Components: Query Engine, Storage - HBase
>Affects Versions: v1.6.0
>Reporter: Dayue Gao
>Assignee: Dayue Gao
> Attachments: KYLIN-2501.patch
>
>
> *Problem*
> When query server needs to handle millions of records, CubeTupleConverter 
> could become performance bottleneck.
> An experiment shows that converting 5 millions records takes ~11s, which 
> accounts for 50% of the total query time.
> *Motivation*
> Records returned from each storage partition is guaranteed to be ordered. 
> Therefore we could reduce the number of records passed to CubeTupleConverter 
> by
> # merge sorted records from all partitions, similar to what we have done in 
> KYLIN-1787
> # use a [stream 
> aggregate|https://blogs.msdn.microsoft.com/craigfr/2006/09/13/stream-aggregate/]
>  algorithm on merged stream to aggregate those records with the same key
> *Proposal*
> # Add a new physical operator GTStreamAggregateScanner which implements the 
> stream aggregate algorithm
> # Refine SortedIteratorMergerWithLimit that was used to merge sort records 
> from different partitions. The previous implementation has performance issues 
> (KYLIN-2483) due to expensive record clone
> # Leverage GTStreamAggregateScanner to aggregate records on merged stream
> *Scope*
> Stream aggregate has some good properties such as low memory usage and 
> streamable ordered outputs, making it better than hash/sort based 
> alternatives when input is already sorted. So I bet the new 
> GTStreamAggregateScanner operator can also be used to accelerate cubing and 
> coprocessor aggregation in certain cases. I'll focus on query server in this 
> jira and leave those optimizations as future works.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (KYLIN-2501) Stream Aggregate GTRecords at Query Server

2017-03-10 Thread Dayue Gao (JIRA)

 [ 
https://issues.apache.org/jira/browse/KYLIN-2501?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dayue Gao updated KYLIN-2501:
-
Description: 
*Problem*
When query server needs to handle millions of records, CubeTupleConverter could 
become performance bottleneck.
An experiment shows that converting 5 millions records takes ~11s, which 
accounts for 50% of the total query time.

*Motivation*
Records returned from each storage partition is guaranteed to be ordered. 
Therefore we could reduce the number of records passed to CubeTupleConverter by
# merge sorted records from all partitions, similar to what we have done in 
KYLIN-1787
# use a [stream 
aggregate|https://blogs.msdn.microsoft.com/craigfr/2006/09/13/stream-aggregate/]
 algorithm on merged stream to aggregate those records with the same key

*Proposal*
# Add a new physical operator GTStreamAggregateScanner which implements the 
stream aggregate algorithm
# Refine SortedIteratorMergerWithLimit that was used to merge sort records from 
different partitions. The previous implementation has performance issues 
(KYLIN-2483) due to expensive record clone
# Leverage GTStreamAggregateScanner to aggregate records on merged stream

*Scope*
Stream aggregate has some good properties such as low memory usage and 
streamable ordered outputs, making it better than hash/sort based alternatives 
when input is already sorted. So I bet the new GTStreamAggregateScanner 
operator can also be used to accelerate cubing and coprocessor aggregation in 
certain cases. I'll focus on query server in this jira and leave those 
optimizations as future works.

  was:
*Problem*
When query server needs to handle millions of records from storage, 
CubeTupleConverter could become performance bottleneck.
An experiment shows that converting 5 millions records takes ~11s, which 
accounts for 50% of the total query time.

*Motivation*
Records returned from each storage partition is guaranteed to be ordered. 
Therefore we could reduce the number of records passed to CubeTupleConverter by
# merge sorted records from all partitions, similar to what we have done in 
KYLIN-1787
# use a [stream 
aggregate|https://blogs.msdn.microsoft.com/craigfr/2006/09/13/stream-aggregate/]
 algorithm on merged stream to aggregate those records with the same key

*Proposal*
# Add a new physical operator GTStreamAggregateScanner which implements the 
stream aggregate algorithm
# Refine SortedIteratorMergerWithLimit that was used to merge sort records from 
different partitions. The previous implementation has performance issues 
(KYLIN-2483) due to expensive record clone
# Leverage GTStreamAggregateScanner to aggregate records on merged stream

*Scope*
Stream aggregate has some good properties such as low memory usage and 
streamable ordered outputs, making it better than hash/sort based alternatives 
when input is already sorted. So I bet the new GTStreamAggregateScanner 
operator can also be used to accelerating cubing and coprocessor in certain 
cases. I will leave it as future works.


> Stream Aggregate GTRecords at Query Server
> --
>
> Key: KYLIN-2501
> URL: https://issues.apache.org/jira/browse/KYLIN-2501
> Project: Kylin
>  Issue Type: Improvement
>  Components: Query Engine, Storage - HBase
>Affects Versions: v1.6.0
>Reporter: Dayue Gao
>Assignee: Dayue Gao
>
> *Problem*
> When query server needs to handle millions of records, CubeTupleConverter 
> could become performance bottleneck.
> An experiment shows that converting 5 millions records takes ~11s, which 
> accounts for 50% of the total query time.
> *Motivation*
> Records returned from each storage partition is guaranteed to be ordered. 
> Therefore we could reduce the number of records passed to CubeTupleConverter 
> by
> # merge sorted records from all partitions, similar to what we have done in 
> KYLIN-1787
> # use a [stream 
> aggregate|https://blogs.msdn.microsoft.com/craigfr/2006/09/13/stream-aggregate/]
>  algorithm on merged stream to aggregate those records with the same key
> *Proposal*
> # Add a new physical operator GTStreamAggregateScanner which implements the 
> stream aggregate algorithm
> # Refine SortedIteratorMergerWithLimit that was used to merge sort records 
> from different partitions. The previous implementation has performance issues 
> (KYLIN-2483) due to expensive record clone
> # Leverage GTStreamAggregateScanner to aggregate records on merged stream
> *Scope*
> Stream aggregate has some good properties such as low memory usage and 
> streamable ordered outputs, making it better than hash/sort based 
> alternatives when input is already sorted. So I bet the new 
> GTStreamAggregateScanner operator can also be used to accelerate cubing and 
> coprocessor aggregation in certain cases. I'll focus on query server in this 
> jira and