[
https://issues.apache.org/jira/browse/SAMZA-483?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14279357#comment-14279357
]
Chris Riccomini commented on SAMZA-483:
---------------------------------------
bq. Depending on the DSL and its internals, generating a SQL-like model may be
easier than generating a relational model.
I think I like the SQL model for the backend. If we think the relational model
is easier to deal with for semantic analysis, then we can translate from the
relational object model (OM) to the SQL OM when we pass the frontend output to
the backend.
The other thing that I find appealing about the SQL model for the backend is
that it seems like we could eliminate specs if we have a SQL OM. Instead of
going SQL OM -> specs -> factory -> operator, I think that we could make the
factories directly take SQL OM fragments: SQL OM -> factory -> operator. I
think this might simplify the API a bit, in cases where a human being wants to
interact directly with the backend (to use operators directly from within a
StreamTask, as part of their regular Java code).
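To make the SQL OM -> factory -> operator idea a bit more concrete, here is a
rough sketch in Java. Every name in it (SqlFragment, WindowFragment,
FilterFragment, Operator, OperatorFactory) is made up for illustration and is
not an existing Samza class; the real operator layer would process messages
rather than return a description string. The only point is that the factory
accepts an OM fragment directly, with no intermediate spec object.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch: none of these types are part of Samza's API.
public class SqlOmSketch {

    /** A fragment of the SQL object model (OM), e.g. one clause of a query. */
    interface SqlFragment {}

    /** Assumed fragment type: a time window over a named stream. */
    static final class WindowFragment implements SqlFragment {
        final String stream;
        final int sizeSeconds;

        WindowFragment(String stream, int sizeSeconds) {
            this.stream = stream;
            this.sizeSeconds = sizeSeconds;
        }
    }

    /** Assumed fragment type: a filter predicate kept as SQL text. */
    static final class FilterFragment implements SqlFragment {
        final String predicate;

        FilterFragment(String predicate) {
            this.predicate = predicate;
        }
    }

    /** Stand-in for a backend operator; it only describes itself here. */
    interface Operator {
        String describe();
    }

    /** Factory that consumes SQL OM fragments directly (no spec step). */
    static final class OperatorFactory {
        static Operator create(SqlFragment fragment) {
            if (fragment instanceof WindowFragment) {
                final WindowFragment w = (WindowFragment) fragment;
                return () -> "window(" + w.stream + ", " + w.sizeSeconds + "s)";
            }
            if (fragment instanceof FilterFragment) {
                final FilterFragment f = (FilterFragment) fragment;
                return () -> "filter(" + f.predicate + ")";
            }
            throw new IllegalArgumentException(
                "Unsupported fragment: " + fragment.getClass().getSimpleName());
        }
    }

    public static void main(String[] args) {
        // A user could build fragments by hand, e.g. from within a StreamTask.
        List<SqlFragment> fragments = Arrays.<SqlFragment>asList(
            new WindowFragment("PageViews", 60),
            new FilterFragment("country = 'US'"));
        for (SqlFragment fragment : fragments) {
            System.out.println(OperatorFactory.create(fragment).describe());
        }
    }
}
```

In this shape, a person writing regular Java code only needs to know the
fragment types and the factory, which is the API simplification I have in mind.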
> A common representation of relational algebra for streaming SQL
> ----------------------------------------------------------------
>
> Key: SAMZA-483
> URL: https://issues.apache.org/jira/browse/SAMZA-483
> Project: Samza
> Issue Type: Sub-task
> Components: sql
> Reporter: Yi Pan (Data Infrastructure)
> Priority: Minor
> Labels: project
>
> Per discussion with [~criccomini] and [~milinda], we agreed that it seems to
> be a good idea to define a common representation of relational algebra on top
> of the operators defined in the operator layer (see SAMZA-482), which can be
> the common base that we can use to generate the description/configuration of
> a Samza job.
> This common layer can also serve as the output of a DSL-like language parser,
> i.e. the result of parsing a DSL program.
> Some additional requirements beyond pure relational algebra:
> 1) the common representation should include window operators and stream
> operators (i.e. IStream/DStream/RStream)
> 2) the common representation should include a description of the parallelism
> of the jobs (i.e. how many partitions the resultant Samza job will use)
> Some references:
> http://web.cs.wpi.edu/~mukherab/i/DCAPE.pdf
> https://cs.uwaterloo.ca/~david/cs848/stream-cql.pdf
> http://davis.wpi.edu/dsrg/PROJECTS/CAPE/publications.htm
> http://davis.wpi.edu/dsrg/PROJECTS/CAPE/slides.htm