Interesting feature; thanks, Rui, for bringing the new option. Please keep me in the loop; I'll take a look when I'm back home tomorrow. This looks like a chance to support other dialects: we see a lot of concerns about translating queries from dialects such as SparkSQL.
Mingmin

> On Aug 4, 2019, at 2:43 PM, Rui Wang <ruw...@google.com> wrote:
>
> Hi David,
>
> That's a good point. I just added a section to discuss benefits in the doc
> (link).
>
> -Rui
>
>> On Sun, Aug 4, 2019 at 2:01 PM David Morávek <david.mora...@gmail.com> wrote:
>> Hi Rui,
>>
>> This is definitely an interesting topic! Can you please elaborate a little bit
>> more on the benefits that this will bring to the end user? All the
>> documents only cover technical details and I'm still not sure what you're
>> trying to achieve product-wise.
>>
>> Best,
>> D.
>>
>>> On Sun, Aug 4, 2019 at 8:07 PM Rui Wang <ruw...@google.com> wrote:
>>> I created a Google doc to explain the basic design of Beam ZetaSQL:
>>> https://docs.google.com/document/d/14Yi4oEMzqS3n9-LfSNi6Q6kQpEP3gWTHzX0HxqUksdc/edit?usp=sharing
>>>
>>> -Rui
>>>
>>>> On Sun, Aug 4, 2019 at 10:02 AM Rui Wang <ruw...@google.com> wrote:
>>>> Thanks Manu for your feedback! Some comments inlined:
>>>>
>>>> On Sat, Aug 3, 2019 at 8:41 PM Manu Zhang <owenzhang1...@gmail.com> wrote:
>>>>>> A question to the community, does the size of the change require any
>>>>>> process besides the usual PR reviews?
>>>>>
>>>>> I think so. This is a big change and has come as kind of a surprise
>>>>> (sorry if I've missed previous discussions).
>>>>>
>>>>> Rui, could you explain more on how things will play out between BeamSQL
>>>>> and ZetaSQL (A design doc including the pluggable interface would be
>>>>> perfect).
>>>>
>>>> I see. I will write a document covering the basic ideas behind Beam ZetaSQL
>>>> (this is my name for "ZetaSQL as a SQL dialect in BeamSQL"; I usually use
>>>> Beam CalciteSQL to refer to Calcite's SQL dialect).
>>>>
>>>> At least from a user's perspective, it's simple to use: set the planner name
>>>> in BeamSqlPipelineOptions and BeamSQL will initialize the matching planner.
>>>> Either Calcite or ZetaSQL is supported now.
>>>>
>>>>> From GitHub, ZetaSQL is mainly in C++, so is what you are doing a port or
>>>>> a connector to ZetaSQL? Do we need to depend on
>>>>> https://github.com/google/zetasql ? ZetaSQL looks interesting but I could
>>>>> barely find any doc for end users.
>>>>
>>>> ZetaSQL provides a Java interface which calls the C++ binary through JNI. To
>>>> use ZetaSQL in BeamSQL, we only need to depend on the ZetaSQL jars in Maven
>>>> Central (https://mvnrepository.com/search?q=zetasql). These jars contain
>>>> all we need to call the ZetaSQL analyzer from Java.
>>>>
>>>>>
>>>>> Also, I'd prefer the PR to be split into two, one for the pluggable
>>>>> interface and one for the ZetaSQL.
>>>>>
>>>> The pluggable planner is already a separate PR that was merged earlier:
>>>> https://github.com/apache/beam/pull/7745
>>>>
>>>> -Rui
>>>>
>>>>> Thanks,
>>>>> Manu
>>>>>
>>>>>> On Sat, Aug 3, 2019 at 10:06 AM Ahmet Altay <al...@google.com> wrote:
>>>>>> Thank you Rui for the heads up.
>>>>>>
>>>>>> A question to the community, does the size of the change require any
>>>>>> process besides the usual PR reviews?
>>>>>>
>>>>>> On Fri, Aug 2, 2019 at 10:23 AM Rui Wang <ruw...@google.com> wrote:
>>>>>>> Hi community,
>>>>>>>
>>>>>>> I have been working on supporting ZetaSQL[1] as a SQL dialect in
>>>>>>> BeamSQL. ZetaSQL is a SQL analyzer open sourced by Google. Here is
>>>>>>> ZetaSQL's documentation[2].
>>>>>>>
>>>>>>> Briefly, the design for integrating ZetaSQL with BeamSQL is this: I made a
>>>>>>> pluggable query planner interface in BeamSQL, so we can easily plug in
>>>>>>> a new planner[3] (in my case, the ZetaSQL planner). Actually, anyone can
>>>>>>> add new planners this way (e.g. for a PostgreSQL dialect).
>>>>>>>
>>>>>>> I want to contribute the ZetaSQL planner and its related code (~10k) to
>>>>>>> the Beam repo (#9210). This contribution barely touches existing Beam code
>>>>>>> (because the idea is a pluggable planner).
>>>>>>>
>>>>>>> Acknowledgement
>>>>>>> Thanks to all the people who provided help during Beam ZetaSQL
>>>>>>> development: Matthew Brown, Brian Hulette, Andrew Pilloud, Kenneth
>>>>>>> Knowles, Anton Kedin and Mikhail Gryzykhin. This list is not exhaustive,
>>>>>>> and thanks also go to contributors who are not listed.
>>>>>>>
>>>>>>> [1]: https://github.com/google/zetasql
>>>>>>> [2]: https://github.com/google/zetasql/tree/master/docs
>>>>>>> [3]: https://github.com/apache/beam/blob/master/sdks/java/extensions/sql/src/main/java/org/apache/beam/sdk/extensions/sql/impl/QueryPlanner.java
>>>>>>>
>>>>>>> -Rui
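
For readers following the thread, the "pluggable planner selected by name" idea can be sketched roughly as below. This is only an illustration under stated assumptions, not Beam's actual code: `PlannerSketch`, `Planner`, `CalcitePlanner`, `ZetaSqlPlanner`, `PlannerRegistry`, the string keys "calcite" and "zetasql", and the fake plan strings are all hypothetical names made up for this example. The thread only says that the planner name is set in BeamSqlPipelineOptions and that the real interface lives in QueryPlanner.java [3].

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Supplier;

// Hypothetical sketch: none of these types are Beam's real classes.
class PlannerSketch {

  /** Stand-in for a pluggable planner interface (assumed shape). */
  interface Planner {
    String plan(String query);
  }

  /** Placeholder for the Calcite dialect; returns a fake plan string. */
  static final class CalcitePlanner implements Planner {
    @Override
    public String plan(String query) {
      return "calcite-plan(" + query + ")";
    }
  }

  /** Placeholder for the ZetaSQL dialect; returns a fake plan string. */
  static final class ZetaSqlPlanner implements Planner {
    @Override
    public String plan(String query) {
      return "zetasql-plan(" + query + ")";
    }
  }

  /** Maps a planner name (as an option value might carry) to a factory. */
  static final class PlannerRegistry {
    private final Map<String, Supplier<Planner>> factories = new LinkedHashMap<>();

    PlannerRegistry register(String name, Supplier<Planner> factory) {
      factories.put(name, factory);
      return this;
    }

    Planner create(String name) {
      Supplier<Planner> factory = factories.get(name);
      if (factory == null) {
        throw new IllegalArgumentException(
            "Unknown planner '" + name + "'; known planners: " + factories.keySet());
      }
      return factory.get();
    }
  }

  /** Registry with the two dialects mentioned in the thread. */
  static PlannerRegistry defaultRegistry() {
    return new PlannerRegistry()
        .register("calcite", CalcitePlanner::new)
        .register("zetasql", ZetaSqlPlanner::new);
  }

  public static void main(String[] args) {
    // The planner name would come from pipeline options; here it is an argument.
    String plannerName = args.length > 0 ? args[0] : "zetasql";
    Planner planner = defaultRegistry().create(plannerName);
    System.out.println(planner.plan("SELECT 1"));
  }
}
```

In this sketch, adding another dialect (the thread mentions PostgreSQL as a possibility) would only require one more `register` call, which is the sense in which the existing code is barely touched.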