Hi David,

That's a good point. I just add a section to discuss benefits in the doc (
link
<https://docs.google.com/document/d/14Yi4oEMzqS3n9-LfSNi6Q6kQpEP3gWTHzX0HxqUksdc/edit#heading=h.bkpdcuy05he>
).


-Rui

On Sun, Aug 4, 2019 at 2:01 PM David Morávek <[email protected]>
wrote:

> Hi Rui,
>
> This is definitely an interesting topic! Can you please elaborate little
> bit more about the benefits, that this will bring to the end user? All the
> documents only cover technical details and I'm still not sure what you're
> trying to achieve product-wise.
>
> Best,
> D.
>
> On Sun, Aug 4, 2019 at 8:07 PM Rui Wang <[email protected]> wrote:
>
>> I created a google doc to explain basic design on Beam ZetaSQL:
>> https://docs.google.com/document/d/14Yi4oEMzqS3n9-LfSNi6Q6kQpEP3gWTHzX0HxqUksdc/edit?usp=sharing
>>
>>
>>
>> -Rui
>>
>> On Sun, Aug 4, 2019 at 10:02 AM Rui Wang <[email protected]> wrote:
>>
>>> Thanks Manu for you feedback! Some comments inlined:
>>>
>>>
>>> On Sat, Aug 3, 2019 at 8:41 PM Manu Zhang <[email protected]>
>>> wrote:
>>>
>>>> A question to the community, does the size of the change require any
>>>>> process besides the usual PR reviews?
>>>>>
>>>>
>>>> I think so. This is a big change and has come as kind of a surprise
>>>> (sorry if I've missed previous discussions).
>>>>
>>>> Rui, could you explain more on how things will play out between BeamSQL
>>>> and ZetaSQL (A design doc including the pluggable interface would be
>>>> perfect).
>>>>
>>>
>>> I see. I will have a document about some basic idea on Beam ZetaSQL
>>> (this is my way to call "ZetaSQL as a SQL dialect in BeamSQL", and I
>>> usually use Beam CalciteSQL to refer to Calcite's SQL dialect.).
>>>
>>> At least from users perspective, it's simple to use: setup planner name
>>> in BeamSqlPipelineOptions
>>> <https://github.com/apache/beam/blob/master/sdks/java/extensions/sql/src/main/java/org/apache/beam/sdk/extensions/sql/impl/BeamSqlPipelineOptions.java#L29>
>>>  and
>>> BeamSQL will initialize different planners: either Calcite or ZetaSQL is
>>> supported now.
>>>
>>>
>>>> From GitHub, ZetaSQL is mainly in C++ so what you are doing is a port
>>>> or a connector to ZetaSQL? Do we need to depend on
>>>> https://github.com/google/zetasql ? ZetaSQL looks interesting but I
>>>> could barely find any doc for end users.
>>>>
>>>
>>> ZetaSQL provides a Java interface which calls c++ binary through JNI.
>>> For using ZetaSQL in BeamSQL, we only need to depend on ZetaSQL jars in
>>> maven central (https://mvnrepository.com/search?q=zetasql). These jars
>>> contains all we need to call ZetaSQL analyzer by Java.
>>>
>>>
>>>> Also, I'd prefer the PR to be split into two, one for the pluggable
>>>> interface and one for the ZetaSQL.
>>>>
>>>> Pluggable planner is already a separate PR merged before:
>>> https://github.com/apache/beam/pull/7745
>>>
>>>
>>> -Rui
>>>
>>>
>>>
>>>> Thanks,
>>>> Manu
>>>>
>>>>
>>>>
>>>> On Sat, Aug 3, 2019 at 10:06 AM Ahmet Altay <[email protected]> wrote:
>>>>
>>>>> Thank you Rui for the heads up.
>>>>>
>>>>> A question to the community, does the size of the change require any
>>>>> process besides the usual PR reviews?
>>>>>
>>>>> On Fri, Aug 2, 2019 at 10:23 AM Rui Wang <[email protected]> wrote:
>>>>>
>>>>>> Hi community,
>>>>>>
>>>>>> I have been working on supporting ZetaSQL[1] as a SQL dialect in
>>>>>> BeamSQL. ZetaSQL is a SQL analyzer open sourced by Google. Here is
>>>>>> ZetaSQL's documentation[2].
>>>>>>
>>>>>> Birfely, the design of integrating ZetaSQL with BeamSQL is, I made a
>>>>>> plugable query planner interface in BeamSQL, and we can easily plug in a
>>>>>> new planner[3] (in my case, ZetaSQL planner). Actually anyone can add new
>>>>>> planners by this way (e.g. PostgreSQL dialect).
>>>>>>
>>>>>> I want to contribute ZetaSQL planner and its related code(~10k) to
>>>>>> Beam repo(#9210 <https://github.com/apache/beam/pull/9210>). This
>>>>>> contribution barely touch existing Beam code (because the idea is 
>>>>>> plugable
>>>>>> planner).
>>>>>>
>>>>>>
>>>>>> *Acknowledgement*
>>>>>> Thanks to all the people who provided help during Beam ZetaSQL
>>>>>> development: Matthew Brown, Brian Hulette, Andrew Pilloud, Kenneth 
>>>>>> Knowles,
>>>>>> Anton Kedin and Mikhail Gryzykhin. This list is not exhausted and also
>>>>>> thanks to contributions which are not listed.
>>>>>>
>>>>>>
>>>>>> [1]: https://github.com/google/zetasql
>>>>>> [2]: https://github.com/google/zetasql/tree/master/docs
>>>>>> [3]:
>>>>>> https://github.com/apache/beam/blob/master/sdks/java/extensions/sql/src/main/java/org/apache/beam/sdk/extensions/sql/impl/QueryPlanner.java
>>>>>>
>>>>>>
>>>>>> -Rui
>>>>>>
>>>>>

Reply via email to