[ 
https://issues.apache.org/jira/browse/DRILL-229?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ben Becker updated DRILL-229:
-----------------------------

    Description: 
Build a merging receiver operator which combines a number of incoming batches 
into a single outgoing batch.

h3. Overview
Each incoming batch is presorted before it reaches this operator, so the 
operator only needs to merge them. A priority queue holds the next unconsumed 
value from each batch and determines which batch supplies the next record. 
When a value is removed from the priority queue, the corresponding record is 
copied from that batch's ValueVectors to the outgoing VectorContainer.

The comparator for the priority queue is generated based on the supplied 
LogicalExpression (e.g. a single ValueVectorReadExpression).
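
The merge step can be illustrated with a minimal sketch. This is not the Drill implementation: plain int arrays stand in for presorted incoming batches, a List stands in for the outgoing VectorContainer, and the names MergingReceiverSketch, Cursor and merge are hypothetical. The sketch also assumes that, after a value is removed from the queue, the next value from the same batch is added back.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.PriorityQueue;

// Hypothetical illustration only; none of these names come from Drill.
class MergingReceiverSketch {

    /** A read position within one presorted incoming batch. */
    static final class Cursor {
        final int[] batch;
        int index;

        Cursor(int[] batch) {
            this.batch = batch;
        }

        int current() {
            return batch[index];
        }

        /** Moves to the next value; returns false when the batch is exhausted. */
        boolean advance() {
            return ++index < batch.length;
        }
    }

    /** Merges presorted batches into one sorted output list. */
    static List<Integer> merge(List<int[]> batches) {
        // Stand-in for the generated comparator: orders cursors by their current value.
        Comparator<Cursor> byCurrentValue = Comparator.comparingInt(Cursor::current);
        PriorityQueue<Cursor> queue = new PriorityQueue<>(byCurrentValue);

        // Seed the queue with the first value of every non-empty batch.
        for (int[] batch : batches) {
            if (batch.length > 0) {
                queue.add(new Cursor(batch));
            }
        }

        List<Integer> outgoing = new ArrayList<>();
        while (!queue.isEmpty()) {
            // The head of the queue identifies the batch holding the next record.
            Cursor next = queue.poll();
            outgoing.add(next.current());
            // Re-insert the cursor if its batch still has values left.
            if (next.advance()) {
                queue.add(next);
            }
        }
        return outgoing;
    }
}
```

The queue never holds more than one entry per incoming batch, so each removal costs O(log k) for k batches regardless of how many records the batches contain.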

h3. Example
The following example illustrates a distributed count operation, where each 
remote fragment counts a subset of the data and the root fragment produces a 
sum of each count aggregate.

h4. Data Flow
{noformat}
RecordReader
  |
  +-> Sort
       |
       +-> StreamingAggregate(COUNT)
              |
              +-> MergingPartitionExchange
                     |
                     +-> StreamingAggregate(SUM)
                            |
                            +-> UnionExchange
                                   |
                                   +-> Screen
{noformat}

h4. Control Flow

{noformat}
Root Fragment
-------------
Screen
   |
   +->UnionExchange
         | | |
         | | +->AggSum
         | |      |
         | |      +->MergingReceiver
         | |
         | +--->AggSum
         |        |
         |        +->MergingReceiver
         |
         +----->AggSum
                  |
                  +->MergingReceiver
             ...

Remote Fragment
---------------
PartitionSender
       |
       +->AggCount
             |
             +->Sort
                 |
                 +->Reader
{noformat}
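
The two aggregation phases in this example can be sketched as follows. This is a simplified illustration rather than Drill code: it assumes the count is grouped by a string key (the description does not say what is being counted), it uses sorted maps in place of record batches, and the names DistributedCountSketch, partialCount and sumCounts are hypothetical.

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical illustration only; none of these names come from Drill.
class DistributedCountSketch {

    /** Remote fragment: counts the rows per key for one subset of the data. */
    static TreeMap<String, Long> partialCount(List<String> keys) {
        // A TreeMap keeps the keys sorted, standing in for Sort + AggCount.
        TreeMap<String, Long> counts = new TreeMap<>();
        for (String key : keys) {
            counts.merge(key, 1L, Long::sum);
        }
        return counts;
    }

    /** Root fragment: sums the partial counts received from every remote fragment. */
    static TreeMap<String, Long> sumCounts(List<TreeMap<String, Long>> partials) {
        TreeMap<String, Long> totals = new TreeMap<>();
        for (Map<String, Long> partial : partials) {
            for (Map.Entry<String, Long> entry : partial.entrySet()) {
                totals.merge(entry.getKey(), entry.getValue(), Long::sum);
            }
        }
        return totals;
    }
}
```

Summing the per-fragment counts gives the same totals as counting all rows in one place, which is why the root fragment applies SUM rather than COUNT.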


  was:
Build a merging receiver operator which combines a number of incoming buffers 
into a single output stream by merging the streams based on equality of one or 
more expressions.

The following example illustrates a distributed count operation, where each 
remote fragment counts a subset of the data and the root fragment produces a 
sum of each count aggregate.

h4. Data Flow
{noformat}
RecordReader
  |
  +-> Sort
       |
       +-> StreamingAggregate(COUNT)
              |
              +-> MergingPartitionExchange
                     |
                     +-> StreamingAggregate(SUM)
                            |
                            +-> UnionExchange
                                   |
                                   +-> Screen
{noformat}

h4. Control Flow

{noformat}
Root Fragment
-------------
Screen
   |
   +->UnionExchange
         | | |
         | | +->AggSum
         | |      |
         | |      +->MergingReceiver
         | |
         | +--->AggSum
         |        |
         |        +->MergingReceiver
         |
         +----->AggSum
                  |
                  +->MergingReceiver
             ...

Remote Fragment
---------------
PartitionSender
       |
       +->AggCount
             |
             +->Sort
                 |
                 +->Reader
{noformat}



> Build a Merging Receiver
> ------------------------
>
>                 Key: DRILL-229
>                 URL: https://issues.apache.org/jira/browse/DRILL-229
>             Project: Apache Drill
>          Issue Type: New Feature
>            Reporter: Jacques Nadeau
>            Assignee: Ben Becker



--
This message was sent by Atlassian JIRA
(v6.1#6144)
