One possibility I’m eager to explore is taking the SQL parsing capabilities in storm-sql (i.e. Calcite) and applying them to SQE. Since SQE syntax is already “SQL-like”, it should(?) be relatively straightforward to do so. That might be the fastest path to a production-ready SQL API.

Anyway, the discussion has died down, so I’ll initiate a VOTE.

-Taylor

> On Sep 6, 2016, at 12:33 AM, Satish Duggana <[email protected]> wrote:
>
> Agree with Jungtaek on the below.
>
> - Better to support SQL instead of SQL like (SQL like creates confusions). We are using Apache Calcite, we should continue with that.
> - Currently trident is used but we should move to windowing abstractions later for specifying boundedness to run the queries.
>
> We should not have two APIs for SQL(storm-sql) and SQL like(SQE APIs). I am +1 for having this code integrated with storm-sql project and expose the respective APIs in Storm-SQL.
>
> Thanks,
> Satish.
>
> On Tue, Sep 6, 2016 at 6:42 AM, Jungtaek Lim <[email protected]> wrote:
>
>> Thanks JW Player folks to come in and express your support. I can see the sponsors of SQE which makes me feel that SQE is nice enough. And also I agree "production-ready" is a great point to value.
>>
>> I have been positive to merge this in, just wondering how we merge Storm SQL and SQE for exposing better interface to users.
>> Supporting SQL-like (DSL) is not same as supporting SQL, and some of competitor projects are already supporting SQL. I think this is not a thing to compromise. Since both Storm SQL and SQE are based on Trident there should be no notable hurdle to add SQE features to Storm SQL. If making SQE to support SQL is easier, I'm also positive to go ahead that direction.
>> (Btw, my vision for Storm SQL is moving to core - tuple based - with windowing, not relying on Trident. It might be after that Storm introduces higher-level API for Beam or another purposes.)
>>
>> - Jungtaek Lim (HeartSaVioR)
>>
>> On Sat, Sep 3, 2016 at 1:42 AM, Douglas Shore <[email protected]> wrote:
>>
>>> We have benefited greatly from being downstream from SQE in powering our data driven solutions.
>>>
>>> I am excited to see this repo grow in breadth and depth.
>>>
>>> On Fri, Sep 2, 2016 at 11:16 AM, Kamil Sindi <[email protected]> wrote:
>>>
>>>> Our data science efforts rely on SQE to power our recommendations engine. I am also excited to contribute to it especially as we continue to implement predictive models at larger scales.
>>>>
>>>> On Fri, Sep 2, 2016 at 10:57 AM, Sahil Shah <[email protected]> wrote:
>>>>
>>>>> I would like to throw my support behind SQE. Having worked with it in a production environment, I have seen the many benefits in testing new topologies and quickly understanding what a topology is doing. As our data needs have grown, we have only increased our reliance on SQE and it stands the test repeatedly. I am excited at the opportunity to contribute to this wonderful open source community.
>>>>>
>>>>> On Fri, Sep 2, 2016 at 10:31 AM, Alex Halter <[email protected]> wrote:
>>>>>
>>>>>> I too want to voice my support for SQE and our commitment to the initiative going forward. We've been working on adapting Storm to our needs for most of two years. It was thoughtfully designed and supports our production needs. We have a long list of features we want to build out and we'd love to work with the community.
>>>>>>
>>>>>> On Fri, Sep 2, 2016 at 10:19 AM, Rohit Garg <[email protected]> wrote:
>>>>>>
>>>>>>> I am one of the developers who has been working on SQE for past 1.5 years. Over time, we have made it more stable and production ready.
>>>>>>>
>>>>>>> As of now, one can easily scale SQE for more production data with easy config changes and re-deploy, aggregate across different dimensions by writing json like sql, write to different state stores and most importantly, address new feature requirements really quick. (Since it's just writing a sql like json file and sqe handles everything for you!)
>>>>>>>
>>>>>>> I think SQE can really help companies who want to setup a production ready and well tested framework within weeks (instead of months) for large scale event stream processing and with minimum risks and limited resources. We are actively working on SQE to make it more awesome and are committed to make the experience of developing a highly scalable and fault tolerant stream processing framework more seamless and less stressful!
>>>>>>>
>>>>>>> On Fri, Sep 2, 2016 at 9:49 AM, Lee Morris <[email protected]> wrote:
>>>>>>>
>>>>>>>> Hi, Storm Dev!
>>>>>>>>
>>>>>>>> I wanted to chime in to show support for SQE and show how committed we are to SQE. *StormSQL looks awesome and has some real potential!*
>>>>>>>>
>>>>>>>> We use SQE in production. It has been tested, code reviewed, load tested, maintained, and processing an average of 8 million tuples per minute or more for over a year now. The investment into this code base has been significant.
>>>>>>>>
>>>>>>>> Please take a look at the code itself. The production quality code is ready to go. Developers with no experience with Storm or even streaming successfully launch robust topologies using SQE. Our productivity in this area went up by orders of magnitude.
>>>>>>>>
>>>>>>>> Based on this experience we realized the value of querying storm, and we decided to give that value back to the storm community.
>>>>>>>>
>>>>>>>> Our data pipelines and real-time processing are very important to the success of JW Player. SQE has been a foundation for that. We will continue to invest into this technology for years to come. Unfortunately we wouldn't be able to adopt StormSQL as is until it has been put through the crucible of production level usage and has had the same rigor applied. It seems much of the development has been over the last couple of weeks.
>>>>>>>>
>>>>>>>> *Quick Gap Analysis (Not Exhaustive)*
>>>>>>>>
>>>>>>>> *States*
>>>>>>>> - SQE supports Redis and MongoDB as states in addition to Kafka. (Soon adding a Test/Monitor State)
>>>>>>>> - SQE supports non-static field names for Redis state
>>>>>>>> - Storm SQL supports Kafka
>>>>>>>> - SQE supports replay filtering for Kafka
>>>>>>>>
>>>>>>>> *Aggregations*
>>>>>>>> - SQE supports stateful, exactly-once aggregations for states that support it
>>>>>>>> - Storm SQL supports aggregations within each micro batch
>>>>>>>>
>>>>>>>> *SQL*
>>>>>>>> - StormSQL supports SQL
>>>>>>>> - SQE supports SQL "like" JSON
>>>>>>>>
>>>>>>>> *Scaling*
>>>>>>>> - SQE has a mechanism for controlling parallelism or scaling
>>>>>>>> - Could not find parallelism or scaling controls within StormSQL (May need to look harder)
>>>>>>>>
>>>>>>>> *Support for SQE*
>>>>>>>> So far the SQE / JW Player developers have been watching this thread without knowing if we should chime in. I call upon the devs at JW to chime in because we are dedicated to the success of this SQL in Storm.
>>>>>>>>
>>>>>>>> (Noticed I said "chime" three times in this email... well now four times)
>>>>>>>>
>>>>>>>> Thanks for reading,
>>>>>>>>
>>>>>>>> Lee Morris, Sr Principal Engineer, Data | JWPLAYER
>>>>>>>>
>>>>>>>> O: 212.244.0140 | M: 215.920.1331
>>>>>>>>
>>>>>>>> 2 Park Avenue, 10th Floor North, New York NY 10016
>>>>>>>>
>>>>>>>> jwplayer.com | @jwplayer <http://twitter.com/jwplayer>
>>>>>>>>
>>>>>>>> On Tue, Aug 30, 2016 at 5:46 PM, Jungtaek Lim <[email protected]> wrote:
>>>>>>>>
>>>>>>>>> Hi Morrigan,
>>>>>>>>>
>>>>>>>>> Thanks for joining discussion. I thought we need to hear your goal to donate SQE code, and opinion for how to apply SQE to Storm SQL and working on further improvements.
>>>>>>>>>
>>>>>>>>> Not sure when you took a look at the feature set of Storm SQL, but if you haven't recently, you may want to do that.
>>>>>>>>> I started working on improving Storm SQL several weeks ago, and many things are addressed in recent weeks.
>>>>>>>>>
>>>>>>>>> * STORM-1435 <https://issues.apache.org/jira/browse/STORM-1435>: You can easily launch Storm SQL runner without concerning dependencies for Storm SQL core and runtime. It wasn't easy to run before STORM-2016 <http://issues.apache.org/jira/browse/STORM-2016> is introduced.
>>>>>>>>> * Refactored Storm SQL code for Trident to fit to Trident operations. Storm SQL parsed SQL and generated topology code but it was not easy to know how topology code is generated, and also hard to determine how Trident optimizations are applied.
>>>>>>>>> * STORM-1434 <https://issues.apache.org/jira/browse/STORM-1434>, STORM-2050 <https://issues.apache.org/jira/browse/STORM-2050>: Addressed GROUP BY with UDAF (User Defined Aggregate Function) on Trident mode. Storm SQL already supported UDF on Trident mode.
>>>>>>>>> * STORM-2057 <https://issues.apache.org/jira/browse/STORM-2057>: JOIN (inner, left outer, right outer, full outer) feature is now on reviewing. Note that only equi-join is supported.
>>>>>>>>>
>>>>>>>>> The changes are not included to official release yet, but I expect Storm 1.1.0 will include them which are worth to try out for early adopters.
>>>>>>>>>
>>>>>>>>> You can also refer STORM-1433 <https://issues.apache.org/jira/browse/STORM-1433> for current phase of Storm SQL. Might need to have another phases (epics) for resolving other issues as well.
>>>>>>>>>
>>>>>>>>> I only had a look at SQE wiki so don't know the detailed features of SQE, but my feeling is that recent changes fills the gap between SQE and Storm SQL, and even addressing some TODOs of SQE. We might need to cross check feature set of each project to make clear on pros and cons for each project.
>>>>>>>>>
>>>>>>>>> Btw, while Storm SQL has been implemented its missing features, the difficult part for Storm SQL is SQL optimizations. There seems lots of SQL optimizations (like filter pushdown) but I'm not expert on that and it apparently needs more deep understanding of Calcite. Other parts also need contributors but we strongly need contributors in this area.
>>>>>>>>>
>>>>>>>>> Thanks,
>>>>>>>>> Jungtaek Lim (HeartSaVioR)
>>>>>>>>>
>>>>>>>>> On Wed, Aug 31, 2016 at 12:47 AM, Morrigan Jones <[email protected]> wrote:
>>>>>>>>>
>>>>>>>>>> Hi, I'm the original creator and primary developer of SQE. Sorry for the radio silence on my part, I was out on vacation the past two weeks.
>>>>>>>>>>
>>>>>>>>>> I'm glad to see the Storm SQL project chugging along. I started SQE because I wanted better tools on top of Storm, particularly the ability to query streams and build topologies using SQL. Our philosophy is to quickly iterate on our production systems and provide immediate value. We've been able to do this with SQE, which powers our streaming systems. Work on SQE and adding functions is driven by our current use cases. The big near term item on our road map is to add SQL parsing. Calcite is very promising there and brings lots of additional features, as I'm sure you know. Additionally, we're going to improve our function, stream, and state support.
>>>>>>>>>>
>>>>>>>>>> The difficulty I can see for us with Storm SQL is the amount of work necessary to get from where we are now with SQE to integrating any functionality and making sure Storm SQL can provide the functionality we have now, assuming that is the path we would all go. We're super excited to see support for Storm grow and mature, and we'd like to be a part of that. But we also have to maintain our ability to iterate quickly and provide immediate value.
>>>>>>>>>>
>>>>>>>>>> --
>>>>>>>>>> Morrigan Jones
>>>>>>>>>> Principal Engineer
>>>>>>>>>> JWPLAYER | Your Way to Play
>>>>>>>>>> [email protected] | jwplayer.com
>>>>>
>>>>> --
>>>>> *Sahil Shah,* Data Engineer
>>>>> *JW*PLAYER | Your Way to Play
>>>>> P: 240.595.1169 | jwplayer.com
>>>
>>> --
>>> *Doug Shore*
>>> Senior Data Engineer
>>> JW PLAYER | Your Way to Play
