[
https://issues.apache.org/jira/browse/DRILL-229?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Ben Becker updated DRILL-229:
-----------------------------
Description:
Build a merging receiver operator which combines a number of incoming batches
into a single outgoing batch.
h3. Overview
Each incoming batch is presorted before it reaches this operator, so the
operator builds a priority queue from the head value of each batch to determine
which batch supplies the next record. When the next value is removed from the
priority queue, the corresponding record is copied from that batch's
ValueVectors to the outgoing VectorContainer, and the following value from the
same batch takes its place in the queue.
The comparator for the priority queue is generated based on the supplied
LogicalExpression (e.g. a single ValueVectorReadExpression).
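The merge described above can be sketched in plain Java. This is only an illustration under stated assumptions: ordinary lists stand in for the incoming batches and their ValueVectors, a caller-supplied Comparator stands in for the comparator generated from the LogicalExpression, and the names MergingReceiverSketch, Head and merge are hypothetical rather than Drill classes.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.Iterator;
import java.util.List;
import java.util.PriorityQueue;

// Hypothetical sketch: plain lists stand in for Drill's batches and ValueVectors.
final class MergingReceiverSketch {

    // One queue entry: the current head value of an input plus the input it came from.
    private static final class Head<T> {
        final T value;
        final int source; // index of the incoming (presorted) input

        Head(T value, int source) {
            this.value = value;
            this.source = source;
        }
    }

    // Merges individually presorted inputs into a single sorted output.
    static <T> List<T> merge(List<List<T>> sortedInputs, Comparator<? super T> order) {
        // Ties are broken by input index so the output order is deterministic.
        Comparator<Head<T>> byValue = (a, b) -> {
            int c = order.compare(a.value, b.value);
            return c != 0 ? c : Integer.compare(a.source, b.source);
        };
        PriorityQueue<Head<T>> queue = new PriorityQueue<>(byValue);

        // Seed the queue with the first value of every non-empty input.
        List<Iterator<T>> cursors = new ArrayList<>();
        for (int i = 0; i < sortedInputs.size(); i++) {
            Iterator<T> it = sortedInputs.get(i).iterator();
            cursors.add(it);
            if (it.hasNext()) {
                queue.add(new Head<>(it.next(), i));
            }
        }

        // Repeatedly emit the smallest head and refill from the input it came from.
        List<T> out = new ArrayList<>();
        while (!queue.isEmpty()) {
            Head<T> head = queue.poll();
            out.add(head.value);
            Iterator<T> it = cursors.get(head.source);
            if (it.hasNext()) {
                queue.add(new Head<>(it.next(), head.source));
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<List<Integer>> inputs = Arrays.asList(
                Arrays.asList(1, 4, 9), Arrays.asList(2, 3, 10), Arrays.asList(5));
        // prints [1, 2, 3, 4, 5, 9, 10]
        System.out.println(merge(inputs, Comparator.<Integer>naturalOrder()));
    }
}
```

The queue never holds more than one entry per input, so each output record costs O(log k) comparisons for k inputs. A real operator would copy into an outgoing batch of bounded size instead of an unbounded list.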
h3. Example
The following example illustrates a distributed count operation, where each
remote fragment counts a subset of the data and the root fragment produces a
sum of each count aggregate.
h4. Data Flow
{noformat}
RecordReader
  |
  +-> Sort
      |
      +-> StreamingAggregate(COUNT)
          |
          +-> MergingPartitionExchange
              |
              +-> StreamingAggregate(SUM)
                  |
                  +-> UnionExchange
                      |
                      +-> Screen
{noformat}
h4. Control Flow
{noformat}
Root Fragment
-------------
Screen
  |
  +->UnionExchange
     | | |
     | | +->AggSum
     | |    |
     | |    +->MergingReceiver
     | |
     | +--->AggSum
     |      |
     |      +->MergingReceiver
     |
     +----->AggSum
            |
            +->MergingReceiver
...
Remote Fragment
---------------
PartitionSender
  |
  +->AggCount
     |
     +->Sort
        |
        +->Reader
{noformat}
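The count-then-sum plan in this example can be sketched in plain Java as follows. All names here (DistributedCountSketch, streamingSum, remoteCount, rootSum) are hypothetical and not Drill classes. To keep the sketch short, sorting the concatenated partial results stands in for the merging receiver, which would instead merge the already sorted streams.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Map;

// Hypothetical sketch of the two-phase aggregation; none of these are Drill classes.
final class DistributedCountSketch {

    // Streaming aggregate: input must be ordered by key; adjacent equal keys are summed.
    static List<Map.Entry<String, Long>> streamingSum(List<Map.Entry<String, Long>> sortedByKey) {
        List<Map.Entry<String, Long>> out = new ArrayList<>();
        for (Map.Entry<String, Long> e : sortedByKey) {
            Map.Entry<String, Long> last = out.isEmpty() ? null : out.get(out.size() - 1);
            if (last != null && last.getKey().equals(e.getKey())) {
                last.setValue(last.getValue() + e.getValue());
            } else {
                out.add(new SimpleEntry<>(e.getKey(), e.getValue()));
            }
        }
        return out;
    }

    // Remote fragment: Sort, then COUNT expressed as a streaming sum of ones.
    static List<Map.Entry<String, Long>> remoteCount(List<String> keys) {
        List<String> sorted = new ArrayList<>(keys);
        Collections.sort(sorted);
        List<Map.Entry<String, Long>> ones = new ArrayList<>();
        for (String key : sorted) {
            ones.add(new SimpleEntry<>(key, 1L));
        }
        return streamingSum(ones);
    }

    // Root fragment: bring the partial counts into key order, then SUM them.
    static List<Map.Entry<String, Long>> rootSum(List<List<Map.Entry<String, Long>>> partials) {
        List<Map.Entry<String, Long>> merged = new ArrayList<>();
        for (List<Map.Entry<String, Long>> partial : partials) {
            merged.addAll(partial);
        }
        // Assumption: a sort stands in for the merging receiver in this sketch.
        merged.sort(Map.Entry.comparingByKey());
        return streamingSum(merged);
    }

    public static void main(String[] args) {
        List<Map.Entry<String, Long>> a = remoteCount(Arrays.asList("b", "a", "b"));
        List<Map.Entry<String, Long>> b = remoteCount(Arrays.asList("a", "c", "a"));
        // prints [a=3, b=2, c=1]
        System.out.println(rootSum(Arrays.asList(a, b)));
    }
}
```

The streaming SUM only compares adjacent records, which is why the partial counts must reach it in key order.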
was:
Build a merging receiver operator which combines a number of incoming buffers
into a single output stream by merging the streams based on equality of one or
more expressions.
The following example illustrates a distributed count operation, where each
remote fragment counts a subset of the data and the root fragment produces a
sum of each count aggregate.
h4. Data Flow
{noformat}
RecordReader
  |
  +-> Sort
      |
      +-> StreamingAggregate(COUNT)
          |
          +-> MergingPartitionExchange
              |
              +-> StreamingAggregate(SUM)
                  |
                  +-> UnionExchange
                      |
                      +-> Screen
{noformat}
h4. Control Flow
{noformat}
Root Fragment
-------------
Screen
  |
  +->UnionExchange
     | | |
     | | +->AggSum
     | |    |
     | |    +->MergingReceiver
     | |
     | +--->AggSum
     |      |
     |      +->MergingReceiver
     |
     +----->AggSum
            |
            +->MergingReceiver
...
Remote Fragment
---------------
PartitionSender
  |
  +->AggCount
     |
     +->Sort
        |
        +->Reader
{noformat}
> Build a Merging Receiver
> ------------------------
>
> Key: DRILL-229
> URL: https://issues.apache.org/jira/browse/DRILL-229
> Project: Apache Drill
> Issue Type: New Feature
> Reporter: Jacques Nadeau
> Assignee: Ben Becker
>