[
https://issues.apache.org/jira/browse/SAMZA-482?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14274107#comment-14274107
]
Yi Pan (Data Infrastructure) commented on SAMZA-482:
----------------------------------------------------
[~milinda], I posted your comment to SAMZA-484, since I felt that the
discussion on data schema and possible DDL belongs to that topic.
As for the specific topics of operators, I think that we are proposing the
following:
# Four different types of operators, implemented via two interface classes:
## stream-to-relation operators, which implement the TupleOperator interface
and generate Relation output, e.g. window operators
## stream-to-stream operators, which implement the TupleOperator interface and
generate Tuple output, e.g. partition operators
## relation-to-relation operators, which implement the RelationOperator
interface and generate Relation output, e.g. all relational algebra operators,
such as join, where, group-by, select, etc.
## relation-to-stream operators, which implement the RelationOperator interface
and generate Tuple output, e.g. Istream or Dstream operators
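To make the four operator categories above concrete, here is a minimal Java sketch with one example operator per category. Only the names TupleOperator, RelationOperator, Tuple and Relation come from the proposal; the process() signature, the fields of Tuple and Relation, OutputContext, CollectingContext and the four example operator classes are assumptions made for illustration and may not match the attached patch. Dstream is omitted; it would emit the rows that disappeared rather than the rows that appeared.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Objects;
import java.util.Set;
import java.util.function.Predicate;

// Hypothetical data types; the classes in the actual patch may differ.
final class Tuple {
    final String key;
    final long value;
    Tuple(String key, long value) { this.key = key; this.value = value; }
    @Override public boolean equals(Object o) {
        return o instanceof Tuple && ((Tuple) o).key.equals(key) && ((Tuple) o).value == value;
    }
    @Override public int hashCode() { return Objects.hash(key, value); }
}

final class Relation {
    final List<Tuple> rows;
    Relation(List<Tuple> rows) { this.rows = new ArrayList<>(rows); } // defensive copy
}

// Assumed stand-in for the context that an operator sends its output to.
interface OutputContext {
    void sendTuple(Tuple tuple);
    void sendRelation(Relation relation);
}

// The two operator interfaces named above; the method signature is an assumption.
interface TupleOperator { void process(Tuple tuple, OutputContext out); }
interface RelationOperator { void process(Relation relation, OutputContext out); }

// stream-to-relation: a count-based tumbling window.
final class CountWindowOperator implements TupleOperator {
    private final int size;
    private final List<Tuple> buffer = new ArrayList<>();
    CountWindowOperator(int size) { this.size = size; }
    @Override public void process(Tuple tuple, OutputContext out) {
        buffer.add(tuple);
        if (buffer.size() == size) {
            out.sendRelation(new Relation(buffer));
            buffer.clear();
        }
    }
}

// stream-to-stream: re-keys each tuple with a partition number.
final class PartitionOperator implements TupleOperator {
    private final int partitions;
    PartitionOperator(int partitions) { this.partitions = partitions; }
    @Override public void process(Tuple tuple, OutputContext out) {
        int p = Math.floorMod(tuple.key.hashCode(), partitions);
        out.sendTuple(new Tuple(Integer.toString(p), tuple.value));
    }
}

// relation-to-relation: a "where" filter.
final class WhereOperator implements RelationOperator {
    private final Predicate<Tuple> predicate;
    WhereOperator(Predicate<Tuple> predicate) { this.predicate = predicate; }
    @Override public void process(Relation relation, OutputContext out) {
        List<Tuple> kept = new ArrayList<>();
        for (Tuple t : relation.rows) if (predicate.test(t)) kept.add(t);
        out.sendRelation(new Relation(kept));
    }
}

// relation-to-stream: Istream emits rows that were not in the previous relation.
final class IstreamOperator implements RelationOperator {
    private Set<Tuple> previous = new HashSet<>();
    @Override public void process(Relation relation, OutputContext out) {
        for (Tuple t : relation.rows) if (!previous.contains(t)) out.sendTuple(t);
        previous = new HashSet<>(relation.rows);
    }
}

// Collects outputs so that the sketch can be exercised without a runtime.
final class CollectingContext implements OutputContext {
    final List<Tuple> tuples = new ArrayList<>();
    final List<Relation> relations = new ArrayList<>();
    @Override public void sendTuple(Tuple tuple) { tuples.add(tuple); }
    @Override public void sendRelation(Relation relation) { relations.add(relation); }
}
```

The point of the sketch is that the input type is fixed by the interface (Tuple vs. Relation), while the output type is decided by which send method the operator calls, which is how two interfaces can cover four operator categories.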
Those operators are connected via the following two context interface classes:
# RuntimeSystemContext, which provides the context interface that the operators
send their output to
# OperatorRoutingContext, which provides the connection interface between the
operators
In the example, we have enabled two execution models via the above two context
classes:
# RoutableRuntimeContext, which uses the routing information from an
OperatorRoutingContext and directly invokes the next operator when the current
operator sends its output via the RoutableRuntimeContext
# StoredRuntimeContext, which provides storage for each operator's outputs and
stores the output when the current operator sends it via the
StoredRuntimeContext. It is then up to the programmer to query the
StoredRuntimeContext to get the operator's output and proceed w/ the next steps
The first execution model allows integration w/ a future SQL parser and
planner to run a task automatically, while the second model allows any
programmer to use the operators from the library in an arbitrary context.
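The difference between the two execution models can be sketched in Java as follows. The four context names are the ones used above, but every method signature here is an assumption, and SketchOperator, MapRoutingContext, AddOperator and ExecutionModelsDemo are hypothetical helpers that stand in for the real operator interfaces so that the example stays short.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Simplified stand-in for TupleOperator/RelationOperator (hypothetical).
interface SketchOperator {
    String id();
    void process(Object input, RuntimeSystemContext context);
}

// Context that operators send their output to (method name is an assumption).
interface RuntimeSystemContext {
    void send(String fromOperatorId, Object output);
}

// Connection information between operators (method name is an assumption).
interface OperatorRoutingContext {
    SketchOperator nextOperator(String operatorId); // null at the end of a chain
}

final class MapRoutingContext implements OperatorRoutingContext {
    private final Map<String, SketchOperator> next = new HashMap<>();
    void connect(SketchOperator from, SketchOperator to) { next.put(from.id(), to); }
    @Override public SketchOperator nextOperator(String operatorId) { return next.get(operatorId); }
}

// Model 1: sending an output immediately invokes the next operator.
final class RoutableRuntimeContext implements RuntimeSystemContext {
    private final OperatorRoutingContext routing;
    final List<Object> finalOutputs = new ArrayList<>();
    RoutableRuntimeContext(OperatorRoutingContext routing) { this.routing = routing; }
    @Override public void send(String fromOperatorId, Object output) {
        SketchOperator next = routing.nextOperator(fromOperatorId);
        if (next == null) finalOutputs.add(output);
        else next.process(output, this);
    }
}

// Model 2: outputs are stored per operator until the programmer asks for them.
final class StoredRuntimeContext implements RuntimeSystemContext {
    private final Map<String, List<Object>> store = new HashMap<>();
    @Override public void send(String fromOperatorId, Object output) {
        store.computeIfAbsent(fromOperatorId, k -> new ArrayList<>()).add(output);
    }
    List<Object> removeOutputs(String operatorId) {
        List<Object> outputs = store.remove(operatorId);
        return outputs == null ? new ArrayList<>() : outputs;
    }
}

// Toy operator that adds a constant to an Integer input.
final class AddOperator implements SketchOperator {
    private final String id;
    private final int delta;
    AddOperator(String id, int delta) { this.id = id; this.delta = delta; }
    @Override public String id() { return id; }
    @Override public void process(Object input, RuntimeSystemContext context) {
        context.send(id, ((Integer) input) + delta);
    }
}

final class ExecutionModelsDemo {
    // Model 1: the chain add1 -> add10 runs automatically once it is wired up.
    static List<Object> runRouted(int input) {
        SketchOperator first = new AddOperator("add1", 1);
        SketchOperator second = new AddOperator("add10", 10);
        MapRoutingContext routing = new MapRoutingContext();
        routing.connect(first, second);
        RoutableRuntimeContext context = new RoutableRuntimeContext(routing);
        first.process(input, context);
        return context.finalOutputs;
    }

    // Model 2: the programmer fetches each stored output and drives the next step.
    static List<Object> runStored(int input) {
        SketchOperator first = new AddOperator("add1", 1);
        SketchOperator second = new AddOperator("add10", 10);
        StoredRuntimeContext context = new StoredRuntimeContext();
        first.process(input, context);
        for (Object o : context.removeOutputs("add1")) second.process(o, context);
        return context.removeOutputs("add10");
    }
}
```

Both methods compute the same result from the same operators; the only thing that changes is who decides what runs next. In runRouted that decision lives in the routing context, which is what a planner would populate, while in runStored it lives in the caller's own loop.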
> Identify the set of operators for SQL on Samza
> ----------------------------------------------
>
> Key: SAMZA-482
> URL: https://issues.apache.org/jira/browse/SAMZA-482
> Project: Samza
> Issue Type: Sub-task
> Reporter: Yi Pan (Data Infrastructure)
> Priority: Minor
> Labels: project
> Attachments: All class diagrams - v0.2.pdf, rb29592.patch
>
>
> This came out of a discussion between [~milinda], [~criccomini], and
> [~nickpan47]. We think that it will be a good idea to separate the operators
> layer from the high-level language layer, s.t. we can allow different
> languages to be built on-top-of the same set of fundamental functions (i.e.
> SQL-like or DSL).
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)