[
https://issues.apache.org/jira/browse/SAMZA-390?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14249051#comment-14249051
]
Yi Pan (Data Infrastructure) commented on SAMZA-390:
----------------------------------------------------
My notes on Spark StreamSQL after a quick look at:
https://issues.apache.org/jira/secure/attachment/12637803/StreamSQLDesignDoc.pdf
# Spark StreamSQL adopts [SQLstream|http://www.sqlstream.com/docs]'s syntax.
Some of SQLstream's extensions for stream operators are not as SQL-like as the
StreamSQL syntax, and the online doc says this is done "deliberately".
# It seems that Spark StreamSQL does not have a time-window syntax
implemented yet. According to SPARK-1363, time-window syntax is planned for
phase two.
# It is not clear to me how the windowing technique works across the RDD
boundaries in a DStream. Mini-batches of RDDs are not exactly the same as a
continuous stream in which the atomic unit of computation is a single tuple
from the stream.
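
To make the last point concrete, here is a minimal plain-Python sketch of the two models. It uses no Spark or Samza APIs. The event data, the function names, and the choice of a window that spans a whole number of batches are all assumptions made for illustration, not a description of how Spark StreamSQL actually implements windowing.

```python
from collections import deque

# Hypothetical events: (timestamp in ms, value). Invented for this sketch.
EVENTS = [(500, 1), (1200, 2), (2700, 3), (3100, 4), (4800, 5), (5300, 6)]


def per_tuple_window_sums(events, window_ms):
    """Tuple-at-a-time model: emit a window sum after every incoming tuple.

    The window covers (ts - window_ms, ts] relative to the newest tuple.
    """
    buf = deque()
    out = []
    for ts, value in events:
        buf.append((ts, value))
        # Evict tuples that have fallen out of the window.
        while buf[0][0] <= ts - window_ms:
            buf.popleft()
        out.append((ts, sum(v for _, v in buf)))
    return out


def mini_batch_window_sums(events, batch_ms, window_batches):
    """Mini-batch model: cut the stream into fixed batches, then window
    over whole batches. Results appear only at batch boundaries.
    """
    if not events:
        return []
    n_batches = events[-1][0] // batch_ms + 1
    batches = [[] for _ in range(n_batches)]
    for ts, value in events:
        batches[ts // batch_ms].append(value)
    out = []
    for i in range(n_batches):
        start = max(0, i - window_batches + 1)
        window = [v for batch in batches[start:i + 1] for v in batch]
        # The result is stamped with the end time of batch i.
        out.append(((i + 1) * batch_ms, sum(window)))
    return out


per_tuple = per_tuple_window_sums(EVENTS, window_ms=2000)
mini_batch = mini_batch_window_sums(EVENTS, batch_ms=1000, window_batches=2)
```

With a 2-second window over the same input, the per-tuple version produces one result per tuple, aligned to that tuple's timestamp, while the mini-batch version produces one result per batch interval, aligned to batch boundaries. The two sets of sums do not match in general, which is the gap I am unsure how a SQL window clause would paper over.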
> High-Level Language for Samza
> -----------------------------
>
> Key: SAMZA-390
> URL: https://issues.apache.org/jira/browse/SAMZA-390
> Project: Samza
> Issue Type: New Feature
> Reporter: Raul Castro Fernandez
> Priority: Minor
> Labels: project
>
> Discussion about high-level languages to define Samza queries. Queries are
> defined in this language and transformed to a dataflow graph where the nodes
> are Samza jobs.