Hi David,

That's a good point. I just added a section to the doc that discusses the benefits ( link <https://docs.google.com/document/d/14Yi4oEMzqS3n9-LfSNi6Q6kQpEP3gWTHzX0HxqUksdc/edit#heading=h.bkpdcuy05he> ).

-Rui

On Sun, Aug 4, 2019 at 2:01 PM David Morávek wrote:

> Hi Rui,
>
> This is definitely an interesting topic! Can you please elaborate a little
> bit more on the benefits that this will bring to the end user? All the
> documents only cover technical details and I'm still not sure what you're
> trying to achieve product-wise.
>
> Best,
> D.
>
> On Sun, Aug 4, 2019 at 8:07 PM Rui Wang wrote:
>
>> I created a Google doc to explain the basic design of Beam ZetaSQL:
>> https://docs.google.com/document/d/14Yi4oEMzqS3n9-LfSNi6Q6kQpEP3gWTHzX0HxqUksdc/edit?usp=sharing
>>
>> -Rui
>>
>> On Sun, Aug 4, 2019 at 10:02 AM Rui Wang wrote:
>>
>>> Thanks Manu for your feedback! Some comments inlined:
>>>
>>> On Sat, Aug 3, 2019 at 8:41 PM Manu Zhang wrote:
>>>
>>>>> A question to the community, does the size of the change require any
>>>>> process besides the usual PR reviews?
>>>>
>>>> I think so. This is a big change and has come as kind of a surprise
>>>> (sorry if I've missed previous discussions).
>>>>
>>>> Rui, could you explain more on how things will play out between BeamSQL
>>>> and ZetaSQL (a design doc including the pluggable interface would be
>>>> perfect).
>>>
>>> I see. I will write a document covering the basic ideas behind Beam
>>> ZetaSQL (this is my name for "ZetaSQL as a SQL dialect in BeamSQL"; I
>>> usually use Beam CalciteSQL to refer to Calcite's SQL dialect).
>>>
>>> At least from the user's perspective, it's simple to use: set the planner
>>> name in BeamSqlPipelineOptions
>>> <https://github.com/apache/beam/blob/master/sdks/java/extensions/sql/src/main/java/org/apache/beam/sdk/extensions/sql/impl/BeamSqlPipelineOptions.java#L29>
>>> and BeamSQL will initialize the corresponding planner. Calcite and
>>> ZetaSQL are the two planners supported now.
>>>
>>>> From GitHub, ZetaSQL is mainly in C++, so is what you are doing a port
>>>> or a connector to ZetaSQL? Do we need to depend on
>>>> https://github.com/google/zetasql ? ZetaSQL looks interesting but I
>>>> could barely find any doc for end users.
>>>
>>> ZetaSQL provides a Java interface which calls the C++ binary through JNI.
>>> To use ZetaSQL in BeamSQL, we only need to depend on the ZetaSQL jars in
>>> Maven Central (https://mvnrepository.com/search?q=zetasql). These jars
>>> contain everything we need to call the ZetaSQL analyzer from Java.
>>>
>>>> Also, I'd prefer the PR to be split into two, one for the pluggable
>>>> interface and one for the ZetaSQL.
>>>
>>> The pluggable planner is already a separate PR that was merged earlier:
>>> https://github.com/apache/beam/pull/7745
>>>
>>> -Rui
>>>
>>>> Thanks,
>>>> Manu
>>>>
>>>> On Sat, Aug 3, 2019 at 10:06 AM Ahmet Altay wrote:
>>>>
>>>>> Thank you Rui for the heads up.
>>>>>
>>>>> A question to the community, does the size of the change require any
>>>>> process besides the usual PR reviews?
>>>>>
>>>>> On Fri, Aug 2, 2019 at 10:23 AM Rui Wang wrote:
>>>>>
>>>>>> Hi community,
>>>>>>
>>>>>> I have been working on supporting ZetaSQL[1] as a SQL dialect in
>>>>>> BeamSQL. ZetaSQL is a SQL analyzer open sourced by Google. Here is
>>>>>> ZetaSQL's documentation[2].
>>>>>>
>>>>>> Briefly, the design for integrating ZetaSQL with BeamSQL is this: I
>>>>>> made a pluggable query planner interface in BeamSQL, so we can easily
>>>>>> plug in a new planner[3] (in my case, the ZetaSQL planner). In fact,
>>>>>> anyone can add new planners this way (e.g. a PostgreSQL dialect).
>>>>>>
>>>>>> I want to contribute the ZetaSQL planner and its related code (~10k)
>>>>>> to the Beam repo (#9210 <https://github.com/apache/beam/pull/9210>).
>>>>>> This contribution barely touches existing Beam code (because the idea
>>>>>> is a pluggable planner).
>>>>>>
>>>>>> *Acknowledgement*
>>>>>> Thanks to all the people who provided help during Beam ZetaSQL
>>>>>> development: Matthew Brown, Brian Hulette, Andrew Pilloud, Kenneth
>>>>>> Knowles, Anton Kedin and Mikhail Gryzykhin. This list is not
>>>>>> exhaustive; thanks also to the contributors who are not listed.
>>>>>>
>>>>>> [1]: https://github.com/google/zetasql
>>>>>> [2]: https://github.com/google/zetasql/tree/master/docs
>>>>>> [3]:
>>>>>> https://github.com/apache/beam/blob/master/sdks/java/extensions/sql/src/main/java/org/apache/beam/sdk/extensions/sql/impl/QueryPlanner.java
>>>>>>
>>>>>> -Rui
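
For readers who want a concrete picture of the pluggable planner idea discussed in this thread (pick a planner by name, with each SQL dialect plugged in behind one interface), here is a minimal, self-contained Java sketch. It is not Beam's actual code: `Planner`, `PlannerRegistry`, the two planner classes, and the names "calcite" and "zetasql" are all hypothetical stand-ins, and the "plan" is just a tagged string. The real interface is the QueryPlanner linked as [3] above.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Supplier;

// Entry point: choose a planner by name, as a user would via an option.
class PlannerDemo {
    public static void main(String[] args) {
        String name = args.length > 0 ? args[0] : "calcite";
        Planner planner = PlannerRegistry.create(name);
        System.out.println(planner.plan("SELECT 1"));
    }
}

// Hypothetical stand-in for a pluggable query planner interface.
interface Planner {
    // Turns a SQL string into some plan; here the plan is only a string.
    String plan(String sql);
}

// Hypothetical planner for one dialect.
final class CalciteLikePlanner implements Planner {
    @Override
    public String plan(String sql) {
        return "[calcite] " + sql;
    }
}

// Hypothetical planner for a second dialect, added without touching the first.
final class ZetaSqlLikePlanner implements Planner {
    @Override
    public String plan(String sql) {
        return "[zetasql] " + sql;
    }
}

// Maps planner names to factories, so adding a dialect is one register() call.
final class PlannerRegistry {
    private static final Map<String, Supplier<Planner>> FACTORIES = new LinkedHashMap<>();

    static {
        register("calcite", CalciteLikePlanner::new);
        register("zetasql", ZetaSqlLikePlanner::new);
    }

    private PlannerRegistry() {}

    static void register(String name, Supplier<Planner> factory) {
        FACTORIES.put(name, factory);
    }

    static Planner create(String name) {
        Supplier<Planner> factory = FACTORIES.get(name);
        if (factory == null) {
            throw new IllegalArgumentException(
                "Unknown planner: " + name + ", known planners: " + FACTORIES.keySet());
        }
        return factory.get();
    }
}
```

The point of the sketch is the shape of the design: code that runs queries depends only on the interface, so a new dialect (the PostgreSQL example mentioned above, say) would be one more implementation plus one registration, with no changes to the existing planners.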
