Re: SPARK-SQL: Pattern Detection on Live Event or Archived Event Data

2016-03-02 Thread Reynold Xin
SQL is very common, and even some business analysts learn it. Scala and Python are great, but the easiest language to use is often the one a user already knows. And for a lot of users, that is SQL. On Wednesday, March 2, 2016, Jerry Lam wrote: > Hi guys, > > FYI...

Re: SPARK-SQL: Pattern Detection on Live Event or Archived Event Data

2016-03-02 Thread Jerry Lam
Hi guys, FYI... this wiki page (StreamSQL: https://en.wikipedia.org/wiki/StreamSQL) has some history related to Event Stream Processing and SQL. Hi Steve, It is difficult to ask your customers to learn a new language when they are not programmers :) I don't know where/why they

Re: SPARK-SQL: Pattern Detection on Live Event or Archived Event Data

2016-03-01 Thread Jerry Lam
Hi Reynold, You are right. It is about the audience. For instance, in many of my cases, the SQL style is very attractive, if not mandatory, for people with minimal programming knowledge. SQL has its place for communication. Last time I showed someone the Spark DataFrame style, they immediately said it is

Re: SPARK-SQL: Pattern Detection on Live Event or Archived Event Data

2016-03-01 Thread Reynold Xin
There are definitely pros and cons for Scala vs SQL-style CEP. Scala might be more powerful, but the target audience is very different. How much usage is there for a CEP-style SQL syntax in practice? I've never seen it come up so far. On Tue, Mar 1, 2016 at 9:35 AM, Alex Kozlov

Re: SPARK-SQL: Pattern Detection on Live Event or Archived Event Data

2016-03-01 Thread Jerry Lam
Hi Henri, Finally, there is a good reason for me to use Flink! Thanks for sharing this information. This is exactly the solution I'm looking for, especially since the ticket references a paper I was reading a week ago. It would be nice if Flink added support for SQL because this makes business analyst

Re: SPARK-SQL: Pattern Detection on Live Event or Archived Event Data

2016-03-01 Thread Henri Dubois-Ferriere
fwiw Apache Flink just added CEP. Queries are constructed programmatically rather than in SQL, but the underlying functionality is similar. https://issues.apache.org/jira/browse/FLINK-3215 On 1 March 2016 at 08:19, Jerry Lam wrote: > Hi Herman, > > Thank you for your

Re: SPARK-SQL: Pattern Detection on Live Event or Archived Event Data

2016-03-01 Thread Jerry Lam
Hi Herman, Thank you for your reply! This functionality usually finds its place in financial services, which use CEP (complex event processing) for correlation and pattern matching. Many commercial products have this, including Oracle and Teradata Aster Data MR Analytics. I do agree the syntax a

Re: SPARK-SQL: Pattern Detection on Live Event or Archived Event Data

2016-03-01 Thread Herman van Hövell tot Westerflier
Hi Jerry, This is not on any roadmap. I (briefly) browsed through this, and it looks like some sort of window function with very awkward syntax. I think Spark provides better constructs for this using DataFrames/Datasets/nested data... Feel free to submit a PR. Kind regards, Herman van

SPARK-SQL: Pattern Detection on Live Event or Archived Event Data

2016-03-01 Thread Jerry Lam
Hi Spark developers, Would you consider adding support for "Pattern matching in sequences of rows"? More specifically, I'm referring to this: http://web.cs.ucla.edu/classes/fall15/cs240A/notes/temporal/row-pattern-recogniton-11.pdf This is a very cool/useful feature to pattern
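
For readers unfamiliar with the request, "pattern matching in sequences of rows" roughly means: partition the rows, order them, and look for a run of rows that fits a shape. The following is only a minimal plain-Python sketch of that idea, assuming a "price falls, then rises" (V-shape) pattern. It is not Spark code and not the SQL syntax from the linked notes; the function name `find_v_shapes` and the fields `symbol`, `ts`, and `price` are hypothetical, chosen for illustration.

```python
from itertools import groupby
from operator import itemgetter


def find_v_shapes(rows):
    """Return (symbol, start_ts, bottom_ts, end_ts) for each fall-then-rise run.

    `rows` is an iterable of dicts with the hypothetical keys
    "symbol", "ts" and "price".
    """
    matches = []
    # Partition by symbol and order by timestamp within each partition.
    ordered = sorted(rows, key=itemgetter("symbol", "ts"))
    for symbol, group in groupby(ordered, key=itemgetter("symbol")):
        events = list(group)
        i = 0
        while i < len(events) - 1:
            j = i
            # One or more strictly falling prices.
            while j + 1 < len(events) and events[j + 1]["price"] < events[j]["price"]:
                j += 1
            bottom = j
            # One or more strictly rising prices.
            while j + 1 < len(events) and events[j + 1]["price"] > events[j]["price"]:
                j += 1
            if bottom > i and j > bottom:
                matches.append(
                    (symbol, events[i]["ts"], events[bottom]["ts"], events[j]["ts"])
                )
                i = j  # the last row of a match may start the next one
            else:
                i += 1
    return matches


# Hypothetical sample data: one symbol with two V-shapes, one with none.
sample = [
    {"symbol": "ACME", "ts": t, "price": p}
    for t, p in enumerate([12, 11, 9, 10, 13, 12, 14], start=1)
] + [
    {"symbol": "XYZ", "ts": t, "price": p}
    for t, p in enumerate([5, 6, 7], start=1)
]

for match in find_v_shapes(sample):
    print(match)
```

A declarative SQL-style syntax would let an analyst state the shape (falling rows followed by rising rows) without writing the loop by hand, which is the convenience the thread is debating.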