from:"Danny Chan"

Re: [ANNOUNCE] New Calcite PMC chair: Ruben Q L

2022-01-20 Thread Danny Chan

Congratulations, Ruben, and good luck!

Haisheng, thank you for serving as Chair.

On Fri, Jan 21, 2022 at 12:16 PM, Yanjing Wang wrote:

> Congrats Ruben!
>
> On Fri, Jan 21, 2022 at 06:34, Stamatis Zampetakis wrote:
>
> > Congrats Ruben! You're kind, fair, and knowledgeable, very well deserved.
> >
> > Many thanks for serving as a chair Haisheng.
> >
> > Best,
> > Stamatis
> >
> > On Thu, Jan 20, 2022, 12:41 PM Forward Xu wrote:
> >
> > > Congratulations to Ruben!  Thanks for serving as Chair during the last
> > > year, Haisheng!
> > >
> > >
> > > forward
> > >
> > > On Thu, Jan 20, 2022 at 19:38, Ruben Q L wrote:
> > >
> > > > Thanks everyone!
> > > > And thank you Haisheng for being our PMC Chair during last year!
> > > >
> > > >
> > > >
> > > > On Thu, Jan 20, 2022 at 8:52 AM Alessandro Solimando <
> > > > alessandro.solima...@gmail.com> wrote:
> > > >
> > > > > Congratulations to Ruben and thanks a lot to Haisheng!
> > > > >
> > > > > On Thu, 20 Jan 2022 at 08:13, 953396112 <13282155...@qq.com.invalid>
> > > > > wrote:
> > > > >
> > > > > > Congratulations to Ruben!
> > > > > > Thanks for serving as Chair, Haisheng!
> > > > > >
> > > > > >
> > > > > > Best regards,
> > > > > > Zhaohui Xu
> > > > > >
> > > > > >
> > > > > >
> > > > > >
> > > > > > -- Original message --
> > > > > > From: "dev" <hy...@apache.org>
> > > > > > Sent: Thursday, January 20, 2022, 10:29 AM
> > > > > > To: "dev"
> > > > > > Subject: [ANNOUNCE] New Calcite PMC chair: Ruben Q L
> > > > > >
> > > > > >
> > > > > >
> > > > > > Calcite community members,
> > > > > >
> > > > > > I am pleased to announce that we have a new PMC chair and VP as per our
> > > > > > tradition of rotating the chair once a year. I have resigned, and Ruben Q L
> > > > > > was duly elected by the PMC and approved unanimously by the Board.
> > > > > >
> > > > > > Please join me in congratulating Ruben!
> > > > > >
> > > > > > Best,
> > > > > > Haisheng Yuan
> > > > >
> > > >
> > >
> >
>

Re: [ANNOUNCE] New committer: Zhaohui Xu

2021-10-17 Thread Danny Chan

Congratulations !

Danny

On Fri, Oct 15, 2021 at 5:34 PM, XING JIN wrote:

> Congratulations !
>
> Jin
>
> On Fri, Oct 15, 2021 at 12:56 AM, Michael Mior wrote:
>
> > Welcome Zhaohui!
> > --
> > Michael Mior
> > mm...@apache.org
> >
> > On Wed, Oct 6, 2021 at 16:48, Stamatis Zampetakis wrote:
> > >
> > > Apache Calcite's Project Management Committee (PMC) has invited Zhaohui Xu
> > > to become a committer, and we are pleased to announce that they have
> > > accepted.
> > >
> > > Numbers speak for themselves and Zhaohui has over 30 commits already in
> > > master and more than 20 open pull requests waiting to get in. Great record
> > > so far including (but not limited to) improvements and fixes in the view
> > > based rewriting modules, JSON serialization, metadata, and field trimming.
> > >
> > > Zhaohui, welcome, thank you for your contributions, and we look forward to
> > > your further interactions with the community! If you wish, please feel free
> > > to tell us more about yourself and what you are working on.
> > >
> > > Stamatis (on behalf of the Apache Calcite PMC)
> >
>

Re: [ANNOUNCE] New committer: Vladimir Ozerov

2021-06-23 Thread Danny Chan

Congrats, Vladimir!

Best,
Danny Chan

On Thu, Jun 24, 2021 at 11:41 AM, Yanjing Wang wrote:

> Congrats, Vladimir!
>
> On Thu, Jun 24, 2021 at 11:22 AM, Roman Kondakov wrote:
>
> > Congratulations, Vladimir!
> >
> > Roman Kondakov
> >
> > On 24.06.2021 12:23, 段雄 wrote:
> > > Congratulations!
> > >
> > > On Thu, Jun 24, 2021 at 10:21 AM, XING JIN wrote:
> > >
> > >> Congratulations ~
> > >>
> > >> Best,
> > >> Jin
> > >>
> > >> On Thu, Jun 24, 2021 at 9:50 AM, guangyuan wang wrote:
> > >>
> > >>> Congratulations!
> > >>>
> > >>> On Thu, Jun 24, 2021 at 6:39 AM, Francis Chuang wrote:
> > >>>
> > >>>> Congrats, Vladimir!
> > >>>>
> > >>>> Francis
> > >>>>
> > >>>> On 24/06/2021 7:48 am, Haisheng Yuan wrote:
> > >>>>> Congratulations and thanks for your contributions, Vladimir!
> > >>>>>
> > >>>>> Regards,
> > >>>>> Haisheng
> > >>>>>
> > >>>>> On 2021/06/23 21:34:40, Stamatis Zampetakis wrote:
> > >>>>>> Apache Calcite's Project Management Committee (PMC) has invited Vladimir
> > >>>>>> Ozerov to become a committer, and we are pleased to announce that he has
> > >>>>>> accepted.
> > >>>>>>
> > >>>>>> Vladimir is among the few people who know very well the internal workings
> > >>>>>> of the Calcite optimizer. He started and participated in many discussions
> > >>>>>> about the core engine and contributed ideas and code for making it better.
> > >>>>>> Moreover, Vladimir has blogged and talked about Calcite in various
> > >>>>>> conferences and meetups giving publicity and showcasing the capabilities
> > >>>>>> of the project.
> > >>>>>>
> > >>>>>> Vladimir, welcome, thank you for your contributions, and we look forward
> > >>>>>> to your further interactions with the community! If you wish, please feel
> > >>>>>> free to tell us more about yourself and what you are working on.
> > >>>>>>
> > >>>>>> Stamatis (on behalf of the Apache Calcite PMC)
> > >>>>>>
> > >>>>
> > >>>
> > >>
> > >
> >
>

Re: Question: Hints & Rules

2021-05-27 Thread Danny Chan

Jia Tao ~
Hints are production ready; we have implemented many hints internally for
our Flink engine, such as table hints and query hints.

`RelBuilder#project` has already been modified to support hints. One reason
hints get lost is that the projection is dropped for some reason (maybe
by some optimization), and you would need to fix that if you want to keep them.
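
To make the failure mode concrete, here is a minimal sketch of a projection being dropped together with its hints. It uses a toy `Node` class and hypothetical method names (`dropProject`, `dropProjectKeepingHints`), not Calcite's real `RelNode` or `RelBuilder` API, so treat it as an illustration of the idea only.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/** Toy plan model; these classes are illustrative and are not Calcite's API. */
final class HintLossSketch {

  /** A plan node with a name, a list of hint names, and an optional input. */
  static final class Node {
    final String name;
    final List<String> hints;
    final Node input;

    Node(String name, List<String> hints, Node input) {
      this.name = name;
      this.hints = Collections.unmodifiableList(new ArrayList<>(hints));
      this.input = input;
    }
  }

  /** Hypothetical optimization: removes a projection and discards its hints. */
  static Node dropProject(Node project) {
    return project.input;
  }

  /** Same optimization, but moves the projection's hints onto its input. */
  static Node dropProjectKeepingHints(Node project) {
    Node input = project.input;
    List<String> merged = new ArrayList<>(input.hints);
    merged.addAll(project.hints);
    return new Node(input.name, merged, input.input);
  }

  public static void main(String[] args) {
    Node scan = new Node("Scan", Collections.<String>emptyList(), null);
    Node project = new Node("Project", Collections.singletonList("MY_HINT"), scan);
    System.out.println(dropProject(project).hints);
    System.out.println(dropProjectKeepingHints(project).hints);
  }
}
```

Whether a hint should survive on the input node is a design decision: a hint that only makes sense on a projection may be meaningless elsewhere, so a real fix would probably need to be hint-specific.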

Best,
Danny Chan

On Thu, May 27, 2021 at 4:22 PM, JiaTao Tao wrote:

> Hi
> Indeed, there are some situations that lose hints, such as RelBuilder#project:
> we just construct a project without hints (ImmutableList.of()), so it loses
> hints too.
> IMO, it's not ready for production, so in the end we gave up on this feature.
> Hints currently come up from time to time on the mailing list, which shows
> that this is a common use case; maybe we need to go over this feature again
> and make it production ready.
>
>
> Regards!
>
> Aron Tao
>
>
> On Thu, May 27, 2021 at 11:01 AM, Danny Chan wrote:
>
> > Hi ~
> >
> > We have interface RelHintsPropagator for hints propagation customization,
> > but you may need to modify the code of the rule a little bit.
> >
> > Best,
> > Danny Chan
> >
> > On Wed, May 26, 2021 at 7:26 PM, Taras Ledkov wrote:
> >
> > > Hi,
> > >
> > > I am trying to figure out how to use hints correctly.
> > >
> > > My case:
> > > I've registered a hint for the Aggregate node to force expansion of the
> > > DISTINCT aggregate, because in some cases the plan that rewrites the
> > > DISTINCT aggregate to a JOIN cannot be chosen by cost.
> > >
> > > I see that different rules treat hints differently.
> > > e.g.:
> > > - AggregateExpandDistinctAggregatesRule - copies the hints of the source
> > > node to the new aggregate nodes.
> > > - AggregateReduceFunctionsRule - creates a new node and loses the original
> > > hints.
> > >
> > > Is this the correct behavior, or am I missing something in the hints/planner
> > > logic?
> > >
> > > --
> > > Taras Ledkov
> > > Mail-To: tled...@gridgain.com
> > >
> > >
> >
>

Re: Enumerable rules lost hint info

2021-05-27 Thread Danny Chan

We do have some hints for our physical rules to apply certain strategies; a
very common case is the JOIN algorithm, such as NESTED_LOOP,
SHUFFLE_HASH, SORT_MERGE and so on.
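
As a rough illustration of what such a physical hint does, the sketch below lets a hint name override a cost-based default when picking a join algorithm. The enum values come from the message above; the `choose` method and its fallback behavior are assumptions made for the example and do not describe Flink's or Calcite's actual implementation.

```java
import java.util.List;
import java.util.Locale;

/** Toy sketch of hint-driven join algorithm selection; not a real planner API. */
final class JoinHintSketch {

  enum JoinAlgorithm { NESTED_LOOP, SHUFFLE_HASH, SORT_MERGE }

  /**
   * Returns the algorithm named by the first recognized hint, or the
   * cost-based default when no hint names a known algorithm.
   */
  static JoinAlgorithm choose(List<String> hints, JoinAlgorithm costBasedDefault) {
    for (String hint : hints) {
      try {
        return JoinAlgorithm.valueOf(hint.toUpperCase(Locale.ROOT));
      } catch (IllegalArgumentException ignored) {
        // Not a join-algorithm hint; keep looking.
      }
    }
    return costBasedDefault;
  }

  public static void main(String[] args) {
    System.out.println(choose(List.of("sort_merge"), JoinAlgorithm.NESTED_LOOP));
    System.out.println(choose(List.of("UNRELATED"), JoinAlgorithm.SHUFFLE_HASH));
  }
}
```

The point of the sketch is that the hint must still be attached to the join when the physical rule fires, which is exactly what is lost if the conversion rules discard hints.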

The only reason I have not implemented physical hints for Calcite is
that I have not had enough time for that, sorry.

Jia Tao, what kind of physical hints do you expect to have for Calcite
physical operators?

Best,
Danny Chan

On Thu, May 20, 2021 at 1:06 AM, Julian Hyde wrote:

> Sure, there’s a slot to hold the hints. But is there a benefit to keeping
> them in the physical algebra? There is certainly a cost to storing them,
> propagating them, fixing them up as we apply rules.
>
> The main source of hints is formatted comments in the SQL. As we get
> further from the SQL source and the AST, the benefit decreases.
>
> Danny,
>
> Were you envisioning hints being propagated into physical algebra? There’s
> no explicit mention in
> https://docs.google.com/document/d/1mykz-w2t1Yw7CH6NjUWpWqCAf_6YNKxSc59gXafrNCs/edit
> or https://issues.apache.org/jira/browse/CALCITE-482.
>
> Julian
>
>
> > On May 18, 2021, at 7:49 PM, JiaTao Tao  wrote:
> >
> > Hi Julian
> >
> > When hints were introduced, was that part of the plan?
> >
> > "hint" is a field in the base class, so the type of the node doesn't matter
> >
> > public abstract class Project extends SingleRel implements Hintable {
> >   //~ Instance fields
> >
> >   protected final ImmutableList<RexNode> exps;
> >
> >   protected final ImmutableList<RelHint> hints;
> >
> >
> > And in RelBuilder#project, we just construct a project without
> > hints(ImmutableList.of()), so it will lose hints too.
> >
> >   public RelBuilder project(Iterable nodes,
> >       Iterable fieldNames, boolean force) {
> >     return project_(nodes, fieldNames, ImmutableList.of(), force);
> >   }
> >
> >
> >
> > Regards!
> >
> > Aron Tao
> >
> >
> > On Wed, May 19, 2021 at 12:30 AM, Julian Hyde wrote:
> >
> >> Do we really need hints in physical operators? When hints were introduced,
> >> was that part of the plan?
> >>
> >> Julian
> >>
> >>> On May 17, 2021, at 11:50 PM, JiaTao Tao  wrote:
> >>>
> >>> Hi
> >>> Recently, I've been integrating Calcite's hints into our system, and I've
> >>> found that the Enumerable rules just throw the hint info away
> >>> (ImmutableList.of()):
> >>>
> >>> public EnumerableProject(
> >>>     RelOptCluster cluster,
> >>>     RelTraitSet traitSet,
> >>>     RelNode input,
> >>>     List<? extends RexNode> projects,
> >>>     RelDataType rowType) {
> >>>   super(cluster, traitSet, ImmutableList.of(), input, projects, rowType);
> >>>   assert getConvention() instanceof EnumerableConvention;
> >>> }
> >>>
> >>>
> >>> Regards!
> >>>
> >>> Aron Tao
> >>
>
>

Re: Question: Hints & Rules

2021-05-26 Thread Danny Chan

Hi ~

We have interface RelHintsPropagator for hints propagation customization,
but you may need to modify the code of the rule a little bit.
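
The idea behind a hints propagator can be sketched as follows. This is a toy model: the `Node` class, the `Propagator` interface, and `applyRule` are hypothetical stand-ins that only mirror the role `RelHintsPropagator` plays, and the real interface's exact signature should be checked in the Calcite Javadoc.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/** Toy model of hint propagation; illustrative only, not Calcite's API. */
final class HintPropagationSketch {

  /** A plan node that carries a list of hint names. */
  static final class Node {
    final String name;
    final List<String> hints;

    Node(String name, List<String> hints) {
      this.name = name;
      this.hints = Collections.unmodifiableList(new ArrayList<>(hints));
    }

    Node withHints(List<String> newHints) {
      return new Node(name, newHints);
    }
  }

  /** Decides which hints a rule's replacement node receives. */
  interface Propagator {
    Node propagate(Node original, Node replacement);
  }

  /** Copies every hint from the original node to its replacement. */
  static final Propagator COPY_ALL =
      (original, replacement) -> replacement.withHints(original.hints);

  /** Leaves the replacement untouched, so the original hints are lost. */
  static final Propagator DROP_ALL = (original, replacement) -> replacement;

  /** A toy rule: builds a hint-free replacement, then applies the propagator. */
  static Node applyRule(Node original, Propagator propagator) {
    Node replacement =
        new Node(original.name + "Rewritten", Collections.<String>emptyList());
    return propagator.propagate(original, replacement);
  }

  public static void main(String[] args) {
    Node agg = new Node("Aggregate", Collections.singletonList("EXPAND_DISTINCT"));
    System.out.println(applyRule(agg, DROP_ALL).hints);
    System.out.println(applyRule(agg, COPY_ALL).hints);
  }
}
```

In this model the two rules from the question differ only in which propagator they use, which is why changing a rule to keep hints needs a small code change in the rule itself.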

Best,
Danny Chan

On Wed, May 26, 2021 at 7:26 PM, Taras Ledkov wrote:

> Hi,
>
> I am trying to figure out how to use hints correctly.
>
> My case:
> I've registered a hint for the Aggregate node to force expansion of the
> DISTINCT aggregate, because in some cases the plan that rewrites the
> DISTINCT aggregate to a JOIN cannot be chosen by cost.
>
> I see that different rules treat hints differently.
> e.g.:
> - AggregateExpandDistinctAggregatesRule - copies the hints of the source
> node to the new aggregate nodes.
> - AggregateReduceFunctionsRule - creates a new node and loses the original
> hints.
>
> Is this the correct behavior, or am I missing something in the hints/planner
> logic?
>
> --
> Taras Ledkov
> Mail-To: tled...@gridgain.com
>
>

Re: [ANNOUNCE] Apache Calcite Avatica 1.18.0 released

2021-05-24 Thread Danny Chan

Thanks for the nice work, Francis ~

Best,
Danny Chan

On Wed, May 19, 2021 at 11:52 AM, Francis Chuang wrote:

> The Apache Calcite team is pleased to announce the release of Apache
> Calcite Avatica 1.18.0.
>
> Avatica is a framework for building database drivers. Avatica defines a
> wire API and serialization mechanism for clients to communicate with a
> server as a proxy to a database. The reference Avatica client and server
> are implemented in Java and communicate over HTTP. Avatica is a
> sub-project of Apache Calcite.
>
> Apache Calcite Avatica 1.18.0 includes a few dependency updates, bug
> fixes and a breaking change. In addition, the Gradle wrapper has been
> removed from the source distribution in order to comply with ASF policy.
> This release has also been tested with JDK 15 and Guava 29. For a full
> list of changes, please see the release notes:
>
> https://calcite.apache.org/avatica/docs/history.html#v1-18-0
>
> The release is available here:
>
> https://calcite.apache.org/avatica/downloads/avatica.html
>
> We welcome your help and feedback. For more information on how to report
> problems and get involved, visit the project website at:
>
> https://calcite.apache.org/avatica/
>
> or the Apache Calcite project website:
>
> https://calcite.apache.org/
>
> Thanks to everyone involved!
>
> Francis Chuang, on behalf of the Apache Calcite team.
>

Re: Re-write right outer join as left outer join

2021-03-03 Thread Danny Chan

Do you mean join reordering?

You can take a look at LoptOptimizeJoinRule and JoinCommuteRule.
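
The core of the rewrite being asked about can be sketched with a toy model: swapping the two inputs of a join and flipping LEFT and RIGHT yields an equivalent join. The `JoinType` enum and `Join` class below are illustrative and are not Calcite classes; how the rules named above handle outer joins should be checked against their Javadoc.

```java
/** Toy sketch of commuting a join's inputs; not Calcite's rule implementation. */
final class JoinSwapSketch {

  enum JoinType {
    INNER, LEFT, RIGHT, FULL;

    /** LEFT and RIGHT trade places; INNER and FULL are symmetric. */
    JoinType swap() {
      switch (this) {
        case LEFT:
          return RIGHT;
        case RIGHT:
          return LEFT;
        default:
          return this;
      }
    }
  }

  /** A join between two named inputs. */
  static final class Join {
    final String left;
    final String right;
    final JoinType type;

    Join(String left, String right, JoinType type) {
      this.left = left;
      this.right = right;
      this.type = type;
    }

    /** Swaps the inputs and adjusts the join type so the result is equivalent. */
    Join commute() {
      return new Join(right, left, type.swap());
    }

    @Override public String toString() {
      return left + " " + type + " JOIN " + right;
    }
  }

  public static void main(String[] args) {
    System.out.println(new Join("A", "B", JoinType.RIGHT).commute());
  }
}
```

A real rule has more to do than this sketch shows: after the swap, column references in the join condition shift, and a projection is usually needed on top to restore the original column order.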

On Wed, Mar 3, 2021 at 5:12 PM, Abhishek Agarwal wrote:

> Hello everyone,
>
> is there an existing rule that can re-write a right outer join as a left
> outer join by swapping tables?
>

Re: [DISCUSS] Release Managers

2021-02-09 Thread Danny Chan

That’s reasonable; we did make many changes in 2020. Hoping for good
news in 2021.

On Mon, Feb 8, 2021 at 2:17 AM, Julian Hyde wrote:

> Release 2.0 is largely symbolic. We develop Calcite incrementally, so it’s
> more important that there are regular releases, and that they are backwards
> compatible (within reason), than what those releases are called.
>
> That said, a major release would be an opportunity to get rid of the cruft
> (deprecated APIs) that have built up over the years. We should do one at
> some point in 2021.
>
> Julian
>
>
> > On Feb 6, 2021, at 6:31 PM, Danny Chan  wrote:
> >
> > Just curious, when does Calcite plan to release 2.0? There are so many small
> > versions now ~ Do we really need a 1.3x.0 or 1.4x.0 version?
> >
> > On Fri, Feb 5, 2021 at 2:34 AM, Andrei Sereda wrote:
> >
> >> I'm happy to help with 1.30
> >>
> >> On Thu, Feb 4, 2021 at 1:27 PM Julian Hyde  wrote:
> >>
> >>> Yes, I'll be release manager for 1.28.
> >>>
> >>> I think I promised to be a release manager a while ago. I forget which
> >>> release. Sorry about that.
> >>>
> >>> On Thu, Feb 4, 2021 at 1:09 AM Stamatis Zampetakis 
> >>> wrote:
> >>>>
> >>>> That would be great Rui, thanks for volunteering. So far we have:
> >>>>
> >>>> 1.27 Stamatis
> >>>> 1.28 Julian ?
> >>>> 1.29 Rui
> >>>>
> >>>> It would be nice to also plan for 1.30 & 1.31 just to be a bit ahead
> of
> >>>> time. Are there other people willing to help?
> >>>>
> >>>> Best,
> >>>> Stamatis
> >>>>
> >>>>
> >>>> On Wed, Feb 3, 2021 at 12:32 AM Rui Wang 
> wrote:
> >>>>
> >>>>> I can also help on releasing (haven't done one for Calcite).
> >>>>>
> >>>>>
> >>>>> -Rui
> >>>>>
> >>>>> On Tue, Aug 18, 2020 at 2:32 PM Julian Hyde 
> >> wrote:
> >>>>>
> >>>>>> I haven't been RM for a while. I'll do 1.27, then Stamatis can do
> >>> 1.28.
> >>>>>>
> >>>>>> On Sun, Aug 16, 2020 at 1:57 AM Stamatis Zampetakis <
> >>> zabe...@gmail.com>
> >>>>>> wrote:
> >>>>>>>
> >>>>>>> Depending on the timing I can possibly get 1.27.0 or 1.28.0.
> >>>>>>>
> >>>>>>> Best,
> >>>>>>> Stamatis
> >>>>>>>
> >>>>>>> On Fri, Aug 14, 2020 at 2:58 AM Haisheng Yuan 
> >>>>> wrote:
> >>>>>>>
> >>>>>>>> Thanks for volunteering, Ruben! I think you can be the release
> >>>>> manager
> >>>>>> for
> >>>>>>>> 1.26.0.
> >>>>>>>>
> >>>>>>>> We still need 2 more volunteers for the next 2 versions.
> >>>>>>>>
> >>>>>>>> Thanks,
> >>>>>>>> Haisheng
> >>>>>>>>
> >>>>>>>> On 2020/07/25 15:24:43, Ruben Q L  wrote:
> >>>>>>>>> Hi,
> >>>>>>>>>
> >>>>>>>>> I can volunteer for one of them, for example 1.26.0
> >>>>>>>>>
> >>>>>>>>> Best regards,
> >>>>>>>>> Ruben
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>> On Sat, Jul 25, 2020 at 15:23, Haisheng Yuan <hy...@apache.org> wrote:
> >>>>>>>>>
> >>>>>>>>>> Hi,
> >>>>>>>>>>
> >>>>>>>>>> Would anyone be interested in being release manager for v1.26.0,
> >>>>>>>>>> v1.27.0 or v1.28.0?
> >>>>>>>>>> We need 3 volunteers (must be PMC or committer) for these 3 versions.
> >>>>>>>>>>
> >>>>>>>>>> Thanks,
> >>>>>>>>>> Haisheng Yuan
> >>>>>>>>>>
> >>>>>>>>>
> >>>>>>>>
> >>>>>>
> >>>>>
> >>>
> >>
>
>

Re: [DISCUSS] Release Managers

2021-02-06 Thread Danny Chan

Just curious, when does Calcite plan to release 2.0? There are so many small
versions now ~ Do we really need a 1.3x.0 or 1.4x.0 version?

On Fri, Feb 5, 2021 at 2:34 AM, Andrei Sereda wrote:

> I'm happy to help with 1.30
>
> On Thu, Feb 4, 2021 at 1:27 PM Julian Hyde  wrote:
>
> > Yes, I'll be release manager for 1.28.
> >
> > I think I promised to be a release manager a while ago. I forget which
> > release. Sorry about that.
> >
> > On Thu, Feb 4, 2021 at 1:09 AM Stamatis Zampetakis 
> > wrote:
> > >
> > > That would be great Rui, thanks for volunteering. So far we have:
> > >
> > > 1.27 Stamatis
> > > 1.28 Julian ?
> > > 1.29 Rui
> > >
> > > It would be nice to also plan for 1.30 & 1.31 just to be a bit ahead of
> > > time. Are there other people willing to help?
> > >
> > > Best,
> > > Stamatis
> > >
> > >
> > > On Wed, Feb 3, 2021 at 12:32 AM Rui Wang  wrote:
> > >
> > > > I can also help on releasing (haven't done one for Calcite).
> > > >
> > > >
> > > > -Rui
> > > >
> > > > On Tue, Aug 18, 2020 at 2:32 PM Julian Hyde 
> wrote:
> > > >
> > > > > I haven't been RM for a while. I'll do 1.27, then Stamatis can do
> > 1.28.
> > > > >
> > > > > On Sun, Aug 16, 2020 at 1:57 AM Stamatis Zampetakis <
> > zabe...@gmail.com>
> > > > > wrote:
> > > > > >
> > > > > > Depending on the timing I can possibly get 1.27.0 or 1.28.0.
> > > > > >
> > > > > > Best,
> > > > > > Stamatis
> > > > > >
> > > > > > On Fri, Aug 14, 2020 at 2:58 AM Haisheng Yuan 
> > > > wrote:
> > > > > >
> > > > > > > Thanks for volunteering, Ruben! I think you can be the release
> > > > manager
> > > > > for
> > > > > > > 1.26.0.
> > > > > > >
> > > > > > > We still need 2 more volunteers for the next 2 versions.
> > > > > > >
> > > > > > > Thanks,
> > > > > > > Haisheng
> > > > > > >
> > > > > > > On 2020/07/25 15:24:43, Ruben Q L  wrote:
> > > > > > > > Hi,
> > > > > > > >
> > > > > > > > I can volunteer for one of them, for example 1.26.0
> > > > > > > >
> > > > > > > > Best regards,
> > > > > > > > Ruben
> > > > > > > >
> > > > > > > >
> > > > > > > > > On Sat, Jul 25, 2020 at 15:23, Haisheng Yuan <hy...@apache.org> wrote:
> > > > > > > >
> > > > > > > > > Hi,
> > > > > > > > >
> > > > > > > > > > Would anyone be interested in being release manager for v1.26.0,
> > > > > > > > > > v1.27.0 or v1.28.0?
> > > > > > > > > > We need 3 volunteers (must be PMC or committer) for these 3 versions.
> > > > > > > > >
> > > > > > > > > Thanks,
> > > > > > > > > Haisheng Yuan
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > >
> > > >
> >
>

Re: [ANNOUNCE] New Calcite PMC chair: Haisheng Yuan

2020-12-17 Thread Danny Chan

Congratulations, Haisheng!

Thank you for your service, Stamatis.

On Fri, Dec 18, 2020 at 9:58 AM, Julian Hyde wrote:

> Congratulations, Haisheng!
>
> Thank you for your service, Stamatis.
>
> Julian
>
>
> > On Dec 17, 2020, at 5:48 AM, Stamatis Zampetakis 
> wrote:
> >
> > Calcite community members,
> >
> > I am pleased to announce that we have a new PMC chair and VP as per our
> > tradition of rotating the chair once a year. I have resigned, and
> > Haisheng was duly elected by the PMC and approved unanimously by the
> Board.
> >
> > Please join me in congratulating Haisheng!
> >
> > Best,
> > Stamatis
>
>

Re: [DISCUSS] State of the project 2020

2020-11-26 Thread Danny Chan

Oops, I just saw the mail ~
Thanks, Stamatis Zampetakis, for a whole year of great work; you are always
warm and nice to work with.

I'm also +1 for Haisheng being the PMC chair. He is experienced in the
optimizer, and I believe he will take Calcite in a better direction in the
following year ~

Also, thanks to all the contributors to Calcite; Calcite is becoming more and
more popular because of your active contributions ~

On Wed, Nov 25, 2020 at 7:19 AM, Stamatis Zampetakis wrote:

> It seems that there is consensus to appoint Haisheng as the new chair. I
> will start a vote soon!
>
> There were many good ideas throughout this thread about the future of
> Calcite.
> I encourage people to take initiatives and start separate threads in order
> to push them forward.
>
> Best,
> Stamatis
>
> On Sat, Nov 21, 2020 at 4:08 AM Michael Mior  wrote:
>
> > Thanks Stamatis for serving as chair! I would also be happy to have
> > Haisheng for next year. I wonder if there are some additional
> > automated ways we can help with improving the likelihood of merging
> > tests. For example, auto commenting on PRs with suspected issues such
> > as missing tests, poor code style, etc. This would be possible to
> > implement with GitHub actions and could save some committer reviewing
> > time by allowing contributors to fix some issues before anyone looks
> > at it. This would certainly require some thought to be a help rather
> > than a hindrance but I think there's some potential there.
> >
> > --
> > Michael Mior
> > mm...@apache.org
> >
> > On Sat, Nov 14, 2020 at 18:15, Stamatis Zampetakis wrote:
> > >
> > > Thanks to all those who shared their thoughts so far!
> > >
> > > Most of the suggestions around reviews and PRs sound promising.
> > Summarizing
> > > we have:
> > > * Adding component owners for reviews;
> > > * Cleaning up stale PRs;
> > > * Freezing master till every JIRA marked for the next release is
> > reviewed;
> > >
> > > Having component owners has been discussed in the past [1, 2] but
> > > the general feeling was against.
> > > I don't know if now things are different but we could definitely
> > follow-up
> > > in a separate discussion if people are interested.
> > >
> > > Stale PRs is an issue and becoming stale is what is more problematic.
> If
> > we
> > > don't review PRs in a timely manner then
> > > there is a high chance that the contributor will lose interest and the
> PR
> > > will never merge. In the past [3], we discussed the possibility of
> > adding a
> > > bot
> > > for helping us in this task but the idea was abandoned in the end. If
> > done
> > > right, I think it can be useful. For instance, sending a reminder on
> our
> > > dev list
> > > if the PR does not have any activity for a certain time could help us
> > maybe
> > > prioritize this over others.
> > >
> > > One simple thing that we can do without too much discussion is active
> > > committers to assign a number 5/10 PRs to themselves and we make sure
> > > that they are resolved (merged or closed) for the next release.
> > >
> > > Regarding the role of the PMC chair, there were three names mentioned
> so
> > > far: Haisheng, Danny, and Ruben. I believe in all of them!
> > > Depending on the interest from their side and the feedback from the
> rest
> > of
> > > the community we will proceed to a vote.
> > >
> > > Best,
> > > Stamatis
> > >
> > > [1]
> > >
> >
> https://lists.apache.org/thread.html/3b1f43b15306b2f1496fd73bda01dcb3a1dba08747b1267e1b5f8187%40%3Cdev.calcite.apache.org%3E
> > > [2]
> > >
> >
> https://lists.apache.org/thread.html/7231a09fc9d1bad52c229664e8cb2a3631314b34464c2df10c594745%40%3Cdev.calcite.apache.org%3E
> > > [3]
> > >
> >
> https://lists.apache.org/thread.html/e794ec7f36246ec918a409b80759efc30d5433eea08ee564d4ba0c67%40%3Cdev.calcite.apache.org%3E
> > >
> > >
> > > On Wed, Nov 11, 2020 at 10:19 PM Rui Wang 
> wrote:
> > >
> > > > Thanks Stamatis for serving as the PMC chair for Calcite in 2020!
> > > >
> > > > One of the highlights I want to give for the PR reviewing/merging
> area
> > was
> > > > the practice we started to adopt in recent releases: we started to
> > check
> > > > each JIRA marked with the right release version and see whether there
> > is a
> > > > PR in a good shape that can be merged. And we have built a contract
> > that
> > > > unless those promising PR are merged, we won't unlock the Calcite
> > master. I
> > > > feel like this practice has proven to be useful (and of course thanks
> > for
> > > > release managers who have been working on this!).
> > > >
> > > >
> > > > Finally, +1 on Haisheng being our next PMC chair.
> > > >
> > > >
> > > > -Rui
> > > >
> > > > On Wed, Nov 11, 2020 at 1:57 AM Ruben Q L  wrote:
> > > >
> > > > >  Hello,
> > > > >
> > > > > First of all, thanks Stamatis for this detailed "state of the
> > project",
> > > > and
> > > > > thanks for your great work as PMC chair throughout this year, I
> take
> > my
> > > > hat
> > > > > off.
> > > > >
> > > > > Regarding the PR issue, I

Re: Search/Sarg: untested feature merged to the default branch

2020-11-12 Thread Danny Chan

Stop the merge, PLEASE !!!

We all voted -1, so why do you still want to revert?

What is the bug/problem that needs to be fixed? Can you log some issues
for it?

If RexProgramFuzzyTest cannot run, just fix it. From my local testing, the
failure is not caused by Sarg itself; it is the parallel test execution that
makes the Sarg state unstable.

Why do you always push things that are rejected by most of our fellows? I
don't understand.

On Fri, Nov 13, 2020 at 3:09 AM, Vladimir Sitnikov wrote:

> Ok, it took me even more time to stabilize the build than I initially
> expected as there were 4 misbehaving issues (4388, 4397, 4398, 4399)
>
> The PR that reverts SEARCH is https://github.com/apache/calcite/pull/2250
> I'll let the dust settle, and then I'll merge it.
>
> Hopefully, SEARCH then could be re-added when it is ready.
>
> Vladimir
>

Re: Search/Sarg: untested feature merged to the default branch

2020-11-08 Thread Danny Chan

-1 for the revert; we should fix the issues we encountered instead of
mindlessly reverting the code for a whole release.

At least, projects like Apache Flink have upgraded to version 1.26, and the
Sarg feature looks good overall.

We are trying to fix the Sarg issues in version 1.27, and we should continue
with that.

So, log the failing cases as issues and fix them; that is the way to go.

On Mon, Nov 9, 2020 at 7:41 AM, Vladimir Sitnikov wrote:

> Stamatis>People who are responsible for upgrading Calcite, tend to follow
> the dev
> Stamatis>list so they can take the necessary actions.
>
> I think behind the lines of updating https://calcite.apache.org/news/
>
> We might want to mention the following regressions, and we might want to
> mark the release as unstable/broken
> since there's no workaround:
> [CALCITE-4352] RexSimplify incorrectly drops IS NULL and IS NOT NULL from
> SEARCH expressions
> [CALCITE-4325] RexSimplify incorrectly simplifies complex expressions that
> contain Sarg and IS NULL
> [CALCITE-4173], fix assertion error when RexSimplify generates Sarg with
> single null only
>
> >Reverting the feature for the next release will necessitate further
> actions
> >on their side and it may be counter productive.
>
> Reverting the feature != reverting the feature for the next release.
> It might be the feature will be ready by the next release.
>
> I just don't want to keep the code and tests broken.
> There are lots of contributions landing to RexSimplify (which is nice, and
> I love optimizations),
> however, we have disabled RexProgramFuzzyTest long ago, and we really need
> to re-enable it
> otherwise, we would keep introducing regressions.
>
> Vladimir
>

Re: How to parse sql of DDL

2020-11-04 Thread Danny Chan

You can try the sql parser of Apache Flink, which supports the "ALTER TABLE
... ADD COLUMN ..." syntax. [1]

[1]
https://github.com/apache/flink/blob/master/flink-table/flink-sql-parser/src/test/java/org/apache/flink/sql/parser/FlinkSqlParserImplTest.java

On Wed, Nov 4, 2020 at 4:25 PM, xiao cai wrote:

> Hi Liya:
>
>
> Although that does not meet my needs, it still helps me a lot. Thank
> you very much for your reply.
>
>
> Best,
> Xiao.
>
>
> Original message
> From: Fan Liya
> To: dev
> Sent: Wednesday, November 4, 2020, 11:35
> Subject: Re: How to parse sql of DDL
>
>
> Hi Xiao,
>
> I am not sure whether we support the "alter table ... add column ... after
> ..." statement, as some databases do not support this syntax. We do support
> the "alter table ... add column ..." statement. An example can be found in
> MultiJdbcSchemaJoinTest#testSchemaConsistency, which may be helpful.
>
> Best,
> Liya Fan
>
> On Wed, Nov 4, 2020 at 10:53 AM xiao cai wrote:
>
> > Hi Liya:
> > Thanks for your suggestion!
> >
> > Here is the demo I ran using SqlParser:
> >
> > --
> > String sql = "ALTER TABLE `test`.`mysql_sink_test_1` ADD COLUMN "
> >     + "`col_for_test` varchar(255) NULL AFTER `col_binary_1`";
> > SqlParser.Config mysqlConfig =
> >     SqlParser.configBuilder().setLex(Lex.MYSQL).build();
> > SqlParser sqlParser = SqlParser.create(sql.toLowerCase(), mysqlConfig);
> > SqlNode sqlNode = sqlParser.parseStmt();
> > --
> >
> > Then I received the following error message:
> >
> > --
> > org.apache.calcite.sql.parser.SqlParseException: Encountered "table" at
> > line 1, column 7.
> > Was expecting one of:
> >     "SESSION" ...
> >     "SYSTEM" ...
> >   at org.apache.calcite.sql.parser.impl.SqlParserImpl.convertException(SqlParserImpl.java:364)
> >   at org.apache.calcite.sql.parser.impl.SqlParserImpl.normalizeException(SqlParserImpl.java:149)
> >   at org.apache.calcite.sql.parser.SqlParser.handleException(SqlParser.java:148)
> >   at org.apache.calcite.sql.parser.SqlParser.parseQuery(SqlParser.java:163)
> >   at org.apache.calcite.sql.parser.SqlParser.parseStmt(SqlParser.java:188)
> > --
> >
> > Original message
> > From: Fan Liya
> > To: dev
> > Sent: Wednesday, November 4, 2020, 10:40
> > Subject: Re: How to parse sql of DDL
> >
> > Hi Xiao,
> >
> > It's hard to diagnose the problem without seeing your code. My suggestion
> > is that you debug some test cases in our code base (related to
> > SqlParser) and check how they work.
> >
> > Best,
> > Liya Fan
> >
> > On Wed, Nov 4, 2020 at 10:34 AM xiao cai wrote:
> >
> > > Hi
> > > I just want to know how to parse a DDL SQL statement. I used
> > > SqlParser to do this, but it failed.
> > > Can someone help me? Thank you very much.
> > >
> > > Best
> > > xiao.

Re: [DISCUSS] Support Oracle Polymorphic Table Function Syntax

2020-11-02 Thread Danny Chan

In the current codebase, we never actually allow syntax like

SELECT *
  FROM TT

to call a table function, for two reasons:
1. The parser does not allow table function calls without parentheses.
2. SqlConformance#allowNiladicParentheses returns false by default.

In TableFunctionTest, I tried registering a table and a table function with
the same name in the same schema, and there was no ambiguity.
So I guess there is no problem here?
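
The reasoning above can be sketched as a small resolver that keeps tables and table functions in separate namespaces and uses the presence of parentheses to pick one. This is a toy model under the assumption that parentheses are always required for a function call; the class and method names are hypothetical and do not correspond to Calcite's validator.

```java
import java.util.HashSet;
import java.util.Set;

/** Toy FROM-item resolver; illustrative only, not Calcite's validator. */
final class FromItemResolverSketch {

  private final Set<String> tables = new HashSet<>();
  private final Set<String> tableFunctions = new HashSet<>();

  void addTable(String name) {
    tables.add(name);
  }

  void addTableFunction(String name) {
    tableFunctions.add(name);
  }

  /**
   * Resolves a FROM item. An item with parentheses is looked up only among
   * table functions; an item without them is looked up only among tables.
   */
  String resolve(String fromItem) {
    String item = fromItem.trim();
    int paren = item.indexOf('(');
    if (paren >= 0) {
      String name = item.substring(0, paren).trim();
      if (tableFunctions.contains(name)) {
        return "table function " + name;
      }
      throw new IllegalArgumentException("No table function named " + name);
    }
    if (tables.contains(item)) {
      return "table " + item;
    }
    throw new IllegalArgumentException("No table named " + item);
  }

  public static void main(String[] args) {
    FromItemResolverSketch resolver = new FromItemResolverSketch();
    resolver.addTable("TT");
    resolver.addTableFunction("TT");
    System.out.println(resolver.resolve("TT"));
    System.out.println(resolver.resolve("TT(param_a, param_b)"));
  }
}
```

The ambiguity discussed in the thread only appears if the parentheses become optional: a bare `TT` would then match both namespaces, and the resolver would need an explicit precedence rule.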


On Tue, Nov 3, 2020 at 3:02 AM, Julian Hyde wrote:

> Something like that. I guess if TT is a function with zero parameters
> and also a table, then does TT in the following refer to the function
> or the table?
>
>   SELECT *
>   FROM TT
>
> Note that the TT function doesn't even have to be a table function.
> Table functions are functions, and so are in the same namespace.
>
> Furthermore, we look up tables from the current schema, whereas we
> look up functions (and table functions) from the path.
>
> Julian
>
> On Mon, Nov 2, 2020 at 3:55 AM Danny Chan  wrote:
> >
> > Thanks Julian ~
> >
> > > Possibly, but I'd be cautious, because the semantics become ambiguous
> > > if there are name clashes.
> >
> > Assume there is a table named TT and a table function named TT(param_a,
> > param_b) in the same namespace. Do you mean that
> >
> > "FROM TT(param_a, param_b)" or "FROM TT(TABLE TT, param_b)"
> >
> > becomes ambiguous?
> >
> >
> > On Sat, Oct 31, 2020 at 3:19 AM, Julian Hyde wrote:
> >
> > > > CALCITE-1490 suggests only add to SQL server, but as Oracle, Snowflake
> > > > and BigQuery seems all support the simplified syntax, it might be enough
> > > > to justify adding this support to default syntax?
> > >
> > > Possibly, but I'd be cautious, because the semantics become ambiguous
> > > if there are name clashes.
> > >
> > > We should document what should be the behavior if there are name
> > > clashes, and also research what SQL Server, Oracle, Snowflake and
> > > BigQuery do.
> > >
> > > On Fri, Oct 30, 2020 at 10:55 AM Rui Wang 
> wrote:
> > > >
> > > > >Are we planning to support it as a default syntax or as a dialect ?
> > > > >Say, maybe Oracle.
> > > >
> > > > CALCITE-1490 suggests only add to SQL server, but as Oracle, Snowflake
> > > > and BigQuery seems all support the simplified syntax, it might be enough
> > > > to justify adding this support to default syntax?
> > > >
> > > > >Another idea is that maybe we can use the parentheses to distinguish
> > > > >whether
> > > > >this is a table or table-function and use different namespace for
> > > > >validation?
> > > >
> > > > The idea sounds good. The implementation might become complicated.
> > > > Namespaces are only different when looking up a table function or a
> > > table,
> > > > others have to be the same.
> > > >
> > > >
> > > > -Rui
> > > >
> > > >
> > > > On Thu, Oct 29, 2020 at 9:11 PM Jark Wu  wrote:
> > > >
> > > > > Hi all,
> > > > >
> > > > > Yes, there are two separate discussions here.
> > > > > 1) omit TABLE() keyword for table function calls, i.e. CALCITE-1490
> > > > > 2) omit TABLE keyword for the table parameter in TVF.
> > > > >
> > > > > Let's focus on the first discussion. If I understand correctly, the
> > > problem
> > > > > is
> > > > >  the namespace conflict of table and table-function.
> > > > >
> > > > > I have tested table-function features in SQL Server (as following
> > > shows),
> > > > > it seems that the
> > > > > parentheses are required to invoke a parameterless table-function.
> > > > > There is a similar question in Stackoverflow[1].
> > > > >
> > > > > > CREATE FUNCTION udfProductInYear ()
> > > > >   RETURNS TABLE
> > > > >   AS
> > > > >   RETURN
> > > > > SELECT * FROM [dbo].[TEST];
> > > > >
> > > > > > select * from udfProductInYear;
> > > > > Parameters were not supplied for the function 'udfProductInYear'.
> > > > >
> > > > > > select * from udfProductInYear();
> > > > > This works.
> > > > >
> > > > > > create table udfPro

Re: [DISCUSS] Support Oracle Polymorphic Table Function Syntax

2020-11-02 Thread Danny Chan

Thanks Julian ~

> Possibly, but I'd be cautious, because the semantics become ambiguous
> if there are name clashes.

Assume there is a table named TT and a table function named TT(param_a,
param_b) in the same namespace. Do you mean that

"FROM TT(param_a, param_b)" or "FROM TT(TABLE TT, param_b)"

becomes ambiguous?


On Sat, Oct 31, 2020 at 3:19 AM Julian Hyde wrote:

> > CALCITE-1490 suggests only add to SQL server, but as Oracle, Snowflake
> and
> > BigQuery seems all support the simplified syntax, it might be enough to
> > justify adding this support to default syntax?
>
> Possibly, but I'd be cautious, because the semantics become ambiguous
> if there are name clashes.
>
> We should document what should be the behavior if there are name
> clashes, and also research what SQL Server, Oracle, Snowflake and
> BigQuery do.
>
> On Fri, Oct 30, 2020 at 10:55 AM Rui Wang  wrote:
> >
> > >Are we planning to support it as a default syntax or as a dialect ? Say,
> > >maybe Oracle.
> >
> > CALCITE-1490 suggests only add to SQL server, but as Oracle, Snowflake
> and
> > BigQuery seems all support the simplified syntax, it might be enough to
> > justify adding this support to default syntax?
> >
> > >Another idea is that maybe we can use the parentheses to distinguish
> > >whether
> > >this is a table or table-function and use different namespace for
> > >validation?
> >
> > The idea sounds good. The implementation might become complicated.
> > Namespaces are only different when looking up a table function or a
> table,
> > others have to be the same.
> >
> >
> > -Rui
> >
> >
> > On Thu, Oct 29, 2020 at 9:11 PM Jark Wu  wrote:
> >
> > > Hi all,
> > >
> > > Yes, there are two separate discussions here.
> > > 1) omit TABLE() keyword for table function calls, i.e. CALCITE-1490
> > > 2) omit TABLE keyword for the table parameter in TVF.
> > >
> > > Let's focus on the first discussion. If I understand correctly, the
> problem
> > > is
> > >  the namespace conflict of table and table-function.
> > >
> > > I have tested table-function features in SQL Server (as following
> shows),
> > > it seems that the
> > > parentheses are required to invoke a parameterless table-function.
> > > There is a similar question in Stackoverflow[1].
> > >
> > > > CREATE FUNCTION udfProductInYear ()
> > >   RETURNS TABLE
> > >   AS
> > >   RETURN
> > > SELECT * FROM [dbo].[TEST];
> > >
> > > > select * from udfProductInYear;
> > > Parameters were not supplied for the function 'udfProductInYear'.
> > >
> > > > select * from udfProductInYear();
> > > This works.
> > >
> > > > create table udfProductInYear(a int, b VARCHAR);
> > > There is already an object named 'udfProductInYear' in the database.
> > >
> > > From the above error messages, we can see that SQL Server shares the
> same
> > > namespace for table and table-function.
> > > So I think we can do the same thing but only enabled via a new method
> in
> > > SqlConformance.
> > >
> > > Another idea is that maybe we can use the parentheses to distinguish
> > > whether
> > > this is a table or table-function and use different namespace for
> > > validation?
> > >
> > > Best,
> > > Jark
> > >
> > > [1]: https://stackoverflow.com/a/21022682/4915129
> > >
> > >
> > > On Fri, 30 Oct 2020 at 09:48, Danny Chan  wrote:
> > >
> > > > > Let's not use TUMBLE (or HOP, or SESSION) as the main example. It
> is
> > > > somewhat built-in (i.e. has special treatment in the code). Let's
> work
> > > > in terms of, say, the RAMP user-defined function. It is used in
> > > > several tests [1].
> > > >
> > > > We may need to support all the user defined table functions with
> TABLE
> > > > keyword left out.
> > > >
> > > > Are we planning to support it as a default syntax or as a dialect ?
> Say,
> > > > maybe Oracle.
> > > >
> > > > On Fri, Oct 30, 2020 at 5:11 AM Julian Hyde wrote:
> > > >
> > > > > I think we can do them separately. And I think we should do the
> TABLE
> > > > > first. The biggest problem is with namespaces - if you omit TABLE,
> you
> > > > > have to deal with the possibility that there is a table (or view

Re: [DISCUSS] Screenshots of stack traces, code, and debugger sessions

2020-10-29 Thread Danny Chan

Yes, +1 to handle it case by case with proper reminders.

On Fri, Oct 30, 2020 at 6:19 AM Rui Wang wrote:

> >People who write to dev@ list and attach the stacktrace as an image don't
> >really know if the list accepts images or not.
> >Making images forbidden does not help to teach people to
> >attach stacktraces as text.
>
> I guess the way will be when we see those cases, we reply to ask to replace
> the image by text.  This is not perfect :( but I guess that is what will
> happen.
>
>
> -Rui
>
> On Thu, Oct 29, 2020 at 4:19 AM Vladimir Sitnikov <
> sitnikov.vladi...@gmail.com> wrote:
>
> > To all who vote for "disallow images",
> > Do you think it really helps if the images are forbidden?
> >
> > People who write to dev@ list and attach the stacktrace as an image
> don't
> > really know if the list accepts images or not.
> > Making images forbidden does not help to teach people to
> > attach stacktraces as text.
> >
> > We don't want to use MS Word with a screenshot inside to share a
> > stacktrace.
> > I believe that is obvious. We don't need a discussion about that, do we?
> >
> > ---
> >
> > How are you going to share diagrams (e.g. GraphViz-based charts)?
> > How are you going to share memory leak suspects?
> >
> > There are cases when a screenshot makes a lot of sense.
> > For instance, memory dump can easily contain sensitive information, so
> > memory dump can't be shared.
> > However, a screenshot might be helpful, so a screenshot with leak
> suspects
> > would be more than welcome.
> >
> > Vladimir
> >
>

Re: [DISCUSS] Support Oracle Polymorphic Table Function Syntax

2020-10-29 Thread Danny Chan

> Let's not use TUMBLE (or HOP, or SESSION) as the main example. It is
> somewhat built-in (i.e. has special treatment in the code). Let's work
> in terms of, say, the RAMP user-defined function. It is used in
> several tests [1].

We may need to support all user-defined table functions with the TABLE
keyword left out.

Are we planning to support it as the default syntax or as a dialect, say,
Oracle?

On Fri, Oct 30, 2020 at 5:11 AM Julian Hyde wrote:

> I think we can do them separately. And I think we should do the TABLE
> first. The biggest problem is with namespaces - if you omit TABLE, you
> have to deal with the possibility that there is a table (or view)
> called FOO and also a parameterless table function called FOO. Not
> sure how Oracle and SQL Server resolve this.
>
> Let's not use TUMBLE (or HOP, or SESSION) as the main example. It is
> somewhat built-in (i.e. has special treatment in the code). Let's work
> in terms of, say, the RAMP user-defined function. It is used in
> several tests [1].
>
> Julian
>
> [1]
> https://github.com/apache/calcite/blob/ffc1e3b05e7f920d95c48f7c75fd48372684b8e7/core/src/test/java/org/apache/calcite/test/RelBuilderTest.java#L361
>
> On Thu, Oct 29, 2020 at 1:53 PM Rui Wang  wrote:
> >
> > In terms of SQL grammar to support omitting TABLE, there are actually two
> > changes for
> >
> > SELECT *
> > FROM TABLE(TUMBLE(TABLE Bid, DESCRIPTOR(bidtime), INTERVAL '10'
> MINUTES));
> >
> > You can support omitting the TABLE after the FROM, which makes the query
> > become:
> > SELECT *
> > FROM TUMBLE(TABLE Bid, DESCRIPTOR(bidtime), INTERVAL '10' MINUTES);
> >
> > Another is omitting TABLE from the parameter, which is
> > SELECT *
> > FROM TUMBLE(Bid, DESCRIPTOR(bidtime), INTERVAL '10' MINUTES);
> >
> > Do we want to achieve both or just one of these? If for one the grammar
> is
> > too complicated to change but for another the grammar is easier to be
> > changed, are we ok to only have one keyword omitted?
> >
> > -Rui
> >
> > On Thu, Oct 29, 2020 at 11:28 AM Julian Hyde  wrote:
> >
> > > Can we drop the word "polymorphic" from the discussion? Polymorphic
> > > table functions are a valid ask, but can be a separate discussion.
> > > This is about calling table functions without the TABLE keyword,
> > > right?
> > >
> > > Which is what I said to you four years ago:
> > > https://issues.apache.org/jira/browse/CALCITE-1472#comment-15664799
> > >
> > > In other words: let's fix
> > > https://issues.apache.org/jira/browse/CALCITE-1490.
> > >
> > > Julian
> > >
> > > On Thu, Oct 29, 2020 at 2:26 AM Jark Wu  wrote:
> > > >
> > > > Hi all,
> > > >
> > > > It's great to see Calcite already supports PTF syntax and support the
> > > > out-of-box new window syntax.
> > > >
> > > > SELECT *
> > > > FROM TABLE(TUMBLE(TABLE Bid, DESCRIPTOR(bidtime), INTERVAL '10'
> > > MINUTES));
> > > >
> > > > However, some people from the Flink community think that the TABLE()
> > > > keyword is a little verbose for users [5].
> > > > I have seen the discussion in the previous mailing list [1], and
> know the
> > > > TABLE() keyword is mandatory in SQL standard paper.
> > > >
> > > > But it seems that other databases are not following the standard, I
> find
> > > > that Polymorphic Table Function syntax
> > > >  in Oracle is more concise without the TABLE() keywords [2][3], e.g.
> > > >
> > > > SELECT *
> > > > FROM skip_col(scott.emp, COLUMNS(comm, hiredate, mgr))
> > > >
> > > > Besides, SQL Server also seems to support the non TABLE() syntax for
> > > > table-valued functions [4].
> > > > If we can support the Oracle syntax, it would be more consistent with
> > > > existing TUMBLE functions
> > > > and hence easier to be picked up by users.
> > > >
> > > > The new window TVF syntax can be:
> > > >
> > > > SELECT window_start, count(*)
> > > > > FROM TUMBLE(Bid, COLUMNS(bidtime), INTERVAL '10' MINUTES)
> > > > GROUP BY window_start;
> > > >
> > > > Which is more similar to the existing group window functions:
> > > >
> > > > SELECT TUMBLE_START(bidtime, INTERVAL '10' MINUTES), count(*)
> > > > FROM Bid
> > > > GROUP BY TUMBLE(bidtime, INTERVAL '10' MINUTES);
> > > >
> > > > I am fairly inexperienced with the parsing and validation logic in
> > > Calcite,
> > > > so I don't know whether the new syntax can be supported in Calcite.
> > > >
> > > > What do you think?
> > > >
> > > > Best,
> > > > Jark
> > > >
> > > > [1]:
> > > >
> > >
> https://lists.apache.org/x/thread.html/4a91632b1c780ef9d67311f90fce626582faae7d30a134a768c3d324@%3Cdev.calcite.apache.org%3E
> > > > [2]:
> > > https://oracle-base.com/articles/18c/polymorphic-table-functions-18c
> > > > [3]:
> > > >
> > >
> https://docs.oracle.com/en/database/oracle/oracle-database/18/lnpls/plsql-optimization-and-tuning.html#GUID-F8E13DDD-710D-4626-824E-B344849C5AFE
> > > > [4]:
> > > >
> > >
> https://www.sqlservertutorial.net/sql-server-user-defined-functions/sql-server-table-valued-functions/
> > > > [5]:
> > > >
> > >
>

Re: Re: how to register udfa to calcite?

2020-10-27 Thread Danny Chan

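Your "aggr" needs to extend the abstract class SqlAggFunction
<https://github.com/apache/calcite/blob/master/core/src/main/java/org/apache/calcite/sql/SqlAggFunction.java>.
Registered as a plain SqlFunction, it is not treated as an aggregate, which
is why 'val' is reported as not being grouped.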
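
For reference, a minimal sketch of the accumulator life-cycle that a
user-defined aggregate usually follows (init, add, optional merge, result).
The class name MySumAgg is hypothetical and the sketch is plain Java with no
Calcite dependency; how such a class is wrapped into a SqlAggFunction and
registered depends on your Calcite version and is not shown here.

```java
/** Hypothetical sum aggregate; illustrates the accumulator life-cycle only. */
class MySumAgg {
  /** Creates an empty accumulator. */
  public long init() {
    return 0L;
  }

  /** Folds one row's value into the accumulator. */
  public long add(long accumulator, long value) {
    return accumulator + value;
  }

  /** Combines two partial accumulators into one. */
  public long merge(long left, long right) {
    return left + right;
  }

  /** Converts the final accumulator into the aggregate result. */
  public long result(long accumulator) {
    return accumulator;
  }
}
```

The point is that an aggregate is defined by what it does across a group of
rows, which is exactly the information a scalar SqlFunction does not carry.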

On Tue, Oct 27, 2020 at 3:24 PM dawangli wrote:

> Sorry, I mean UDAF.
>
> At 2020-10-27 15:21:36, "Danny Chan"  wrote:
> >Do you mean UDTF or UDAF?
> >
> >On Tue, Oct 27, 2020 at 11:35 AM dawangli wrote:
> >
> >> I have the following code; it just validates a simple GROUP BY query
> >> with a UDAF, but it doesn't work. It seems that Calcite cannot identify
> >> the aggr UDAF. The code and error are shown below.
> >> code
> >> String sql = "SELECT\n" +
> >> "  dim,\n" +
> >> "  aggr(val)\n" +
> >> "FROM\n" +
> >> "  tb\n" +
> >> "GROUP BY\n" +
> >> "  dim";
> >>
> >> CalciteSchema rootSchema = CalciteSchema
> >> .createRootSchema(false, false);
> >>
> >> rootSchema.add("tb", new AbstractTable() { //note: add a table
> >> @Override
> >> public RelDataType getRowType(final RelDataTypeFactory typeFactory) {
> >> RelDataTypeFactory.Builder builder = typeFactory.builder();
> >>
> >> builder.add("dim", new BasicSqlType(new RelDataTypeSystemImpl() {},
> >> SqlTypeName.VARCHAR));
> >> builder.add("val", new BasicSqlType(new RelDataTypeSystemImpl() {},
> >> SqlTypeName.BIGINT));
> >> return builder.build();
> >> }
> >> });
> >>
> >> SqlParser.ConfigBuilder builder = SqlParser.configBuilder();
> >> builder.setQuotedCasing(Casing.TO_UPPER);
> >> builder.setUnquotedCasing(Casing.TO_UPPER);
> >> builder.setCaseSensitive(false);
> >> builder.setLex(Lex.JAVA);
> >> SqlStdOperatorTable sqlStdOperatorTable =
> SqlStdOperatorTable.instance();
> >>
> >> sqlStdOperatorTable.register(new SqlFunction(
> >> new SqlIdentifier("aggr", SqlParserPos.ZERO),
> >> ReturnTypes.cascade(ReturnTypes.explicit(SqlTypeName.ANY),
> >> SqlTypeTransforms.TO_NULLABLE),
> >> null,
> >> OperandTypes.family(SqlTypeFamily.ANY),
> >> Lists.newArrayList(new
> >>
> SqlTypeFactoryImpl(RelDataTypeSystem.DEFAULT).createSqlType(SqlTypeName.ANY)),
> >> SqlFunctionCategory.USER_DEFINED_FUNCTION));
> >>
> >> final FrameworkConfig config = Frameworks.newConfigBuilder()
> >> .defaultSchema(rootSchema.plus())
> >> .parserConfig(builder.build())
> >> .operatorTable(sqlStdOperatorTable)
> >> .build();
> >> Planner planner = Frameworks.getPlanner(config);
> >>
> >> SqlNode originSqlNode = planner.parse(sql);
> >>
> >> SqlNode sqlNode = planner.validate(originSqlNode);
> >>
> >>
> >> error:
> >>
> >>
> >> Exception in thread "main" org.apache.calcite.tools.ValidationException:
> >> org.apache.calcite.runtime.CalciteContextException: From line 3, column
> 8
> >> to line 3, column 10: Expression 'val' is not being grouped
> >> at
> >> org.apache.calcite.prepare.PlannerImpl.validate(PlannerImpl.java:217)
> >> at com.kuaishou.dp.stream.metadata.sql.Test.main(Test.java:422)
> >> Caused by: org.apache.calcite.runtime.CalciteContextException: From line
> >> 3, column 8 to line 3, column 10: Expression 'val' is not being grouped
> >> at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native
> >> Method)
> >> at
> >>
> sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
> >> at
> >>
> sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
> >> at
> java.lang.reflect.Constructor.newInstance(Constructor.java:423)
> >> at
> >>
> org.apache.calcite.runtime.Resources$ExInstWithCause.ex(Resources.java:463)
> >> at
> >> org.apache.calcite.sql.SqlUtil.newContextException(SqlUtil.java:834)
> >> at
> >> org.apache.calcite.sql.SqlUtil.newContextException(SqlUtil.java:819)
> >> at
> >>
> org.apache.calcite.sql.validate.SqlValidatorImpl.newValidationError(SqlValidatorImpl.java:4867)
> >> at
> >> org.apache.calcite.sql.validate.AggChecker.visit(AggC

Re: how to register udfa to calcite?

2020-10-27 Thread Danny Chan

Do you mean UDTF or UDAF?

On Tue, Oct 27, 2020 at 11:35 AM dawangli wrote:

> I have the following code; it just validates a simple GROUP BY query with
> a UDAF, but it doesn't work. It seems that Calcite cannot identify the
> aggr UDAF. The code and error are shown below.
> code
> String sql = "SELECT\n" +
> "  dim,\n" +
> "  aggr(val)\n" +
> "FROM\n" +
> "  tb\n" +
> "GROUP BY\n" +
> "  dim";
>
> CalciteSchema rootSchema = CalciteSchema
> .createRootSchema(false, false);
>
> rootSchema.add("tb", new AbstractTable() { //note: add a table
> @Override
> public RelDataType getRowType(final RelDataTypeFactory typeFactory) {
> RelDataTypeFactory.Builder builder = typeFactory.builder();
>
> builder.add("dim", new BasicSqlType(new RelDataTypeSystemImpl() {},
> SqlTypeName.VARCHAR));
> builder.add("val", new BasicSqlType(new RelDataTypeSystemImpl() {},
> SqlTypeName.BIGINT));
> return builder.build();
> }
> });
>
> SqlParser.ConfigBuilder builder = SqlParser.configBuilder();
> builder.setQuotedCasing(Casing.TO_UPPER);
> builder.setUnquotedCasing(Casing.TO_UPPER);
> builder.setCaseSensitive(false);
> builder.setLex(Lex.JAVA);
> SqlStdOperatorTable sqlStdOperatorTable = SqlStdOperatorTable.instance();
>
> sqlStdOperatorTable.register(new SqlFunction(
> new SqlIdentifier("aggr", SqlParserPos.ZERO),
> ReturnTypes.cascade(ReturnTypes.explicit(SqlTypeName.ANY),
> SqlTypeTransforms.TO_NULLABLE),
> null,
> OperandTypes.family(SqlTypeFamily.ANY),
> Lists.newArrayList(new
> SqlTypeFactoryImpl(RelDataTypeSystem.DEFAULT).createSqlType(SqlTypeName.ANY)),
> SqlFunctionCategory.USER_DEFINED_FUNCTION));
>
> final FrameworkConfig config = Frameworks.newConfigBuilder()
> .defaultSchema(rootSchema.plus())
> .parserConfig(builder.build())
> .operatorTable(sqlStdOperatorTable)
> .build();
> Planner planner = Frameworks.getPlanner(config);
>
> SqlNode originSqlNode = planner.parse(sql);
>
> SqlNode sqlNode = planner.validate(originSqlNode);
>
>
> error:
>
>
> Exception in thread "main" org.apache.calcite.tools.ValidationException:
> org.apache.calcite.runtime.CalciteContextException: From line 3, column 8
> to line 3, column 10: Expression 'val' is not being grouped
> at
> org.apache.calcite.prepare.PlannerImpl.validate(PlannerImpl.java:217)
> at com.kuaishou.dp.stream.metadata.sql.Test.main(Test.java:422)
> Caused by: org.apache.calcite.runtime.CalciteContextException: From line
> 3, column 8 to line 3, column 10: Expression 'val' is not being grouped
> at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native
> Method)
> at
> sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
> at
> sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
> at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
> at
> org.apache.calcite.runtime.Resources$ExInstWithCause.ex(Resources.java:463)
> at
> org.apache.calcite.sql.SqlUtil.newContextException(SqlUtil.java:834)
> at
> org.apache.calcite.sql.SqlUtil.newContextException(SqlUtil.java:819)
> at
> org.apache.calcite.sql.validate.SqlValidatorImpl.newValidationError(SqlValidatorImpl.java:4867)
> at
> org.apache.calcite.sql.validate.AggChecker.visit(AggChecker.java:113)
> at
> org.apache.calcite.sql.validate.AggChecker.visit(AggChecker.java:40)
> at
> org.apache.calcite.sql.SqlIdentifier.accept(SqlIdentifier.java:317)
> at
> org.apache.calcite.sql.util.SqlBasicVisitor$ArgHandlerImpl.visitChild(SqlBasicVisitor.java:123)
> at
> org.apache.calcite.sql.SqlOperator.acceptCall(SqlOperator.java:868)
> at
> org.apache.calcite.sql.validate.AggChecker.visit(AggChecker.java:212)
> at
> org.apache.calcite.sql.validate.AggChecker.visit(AggChecker.java:40)
> at org.apache.calcite.sql.SqlCall.accept(SqlCall.java:139)
> at
> org.apache.calcite.sql.validate.AggregatingSelectScope.checkAggregateExpr(AggregatingSelectScope.java:228)
> at
> org.apache.calcite.sql.validate.AggregatingSelectScope.validateExpr(AggregatingSelectScope.java:237)
> at
> org.apache.calcite.sql.validate.SqlValidatorImpl.validateExpr(SqlValidatorImpl.java:4162)
> at
> org.apache.calcite.sql.validate.SqlValidatorImpl.validateSelectList(SqlValidatorImpl.java:4136)
> at
> org.apache.calcite.sql.validate.SqlValidatorImpl.validateSelect(SqlValidatorImpl.java:3392)
> at
> org.apache.calcite.sql.validate.SelectNamespace.validateImpl(SelectNamespace.java:60)
> at
> org.apache.calcite.sql.validate.AbstractNamespace.validate(AbstractNamespace.java:84)
> at
> org.apache.calcite.sql.validate.SqlValidatorImpl.validateNamespace(SqlValidatorImpl.java:1005)
> at
> org.apache.calcite.sql.validate.SqlValidatorImpl.validateQuery(SqlValidatorImpl.java:965)
> at

Re: [DISCUSS] Some quesitons about the new introduced window table functions (TUMBLE, HOP and SESSION)

2020-10-25 Thread Danny Chan

Thanks for sharing. After some research we found that many DB engines
have poor support for nested row types.
For example, assume:
my_type is a UDT with schema as (a: int, b: varchar(20))
table T has schema as (f0: my_type, f1: int, f2: varchar(20))

PostgreSQL:
select f0.a, max(f1) from T group by f0 -- fails because the "f0" in
"f0.a" is resolved as a table name
select (f0).a, max(f1) from T group by f0 -- succeeds

Oracle:
select f0.a, max(f1) from T group by f0 -- fails because GROUP BY on a UDT
is not supported

In Calcite, the optimization rules for nested rows are poorly supported,
for example project pruning and transpose.
The nullability of nested rows also needs to be clarified.
Currently, when a nested type field is nullable, all its fields are
nullable too;
that does not hold for a Java POJO such as User(id: int, name: string)
(e.g. one whose fields have unboxed types).
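
To make the POJO point concrete, here is a small plain-Java sketch (the
User and NestedNullability classes are hypothetical, for illustration only).
A reference to User may be null, but once the object exists its primitive
field id can never be null; only an access through a possibly-null parent
yields a nullable value.

```java
/** Hypothetical POJO with one primitive and one reference field. */
class User {
  final int id;       // primitive: can never be null
  final String name;  // reference type: may be null

  User(int id, String name) {
    this.id = id;
    this.name = name;
  }
}

class NestedNullability {
  /**
   * Reads user.id through a possibly-null parent. The result is null
   * only because the parent is null, not because the field is nullable.
   */
  static Integer idOf(User user) {
    return user == null ? null : Integer.valueOf(user.id);
  }
}
```

So "the nested field is nullable" and "the fields inside the nested type are
nullable" are two separate properties, which is the distinction the type
system would need to keep.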


On Fri, Oct 23, 2020 at 1:46 PM Pengcheng Liu wrote:

> Hi, Danny,
>I think there is already an explanation in [1], as the author noted:
> > The return value of Tumble is a relation that includes all columns of
> data as well as
> > additional event time columns wstart and wend. It was considered to place
> the original
> > row in a nested row and wstart and wend in a separate nested row, for
> simplicity of
> > namespacing, but that would limit these extensions to engines supporting
> nested rows.
>
> [1]
>
> https://docs.google.com/document/d/138uA7VTpbF84CFrd--cz3YVe0-AQ9ALnsavaSE2JeE4
>
>
> On Thu, Oct 22, 2020 at 7:40 PM Danny Chan wrote:
>
> > Rui Wang, can you help to consult with the author of the window TVF
> syntax
> > ? We want to hear more suggestions.
> >
> > cc @Rui Wang
> >
> > On Thu, Oct 22, 2020 at 12:01 PM Rui Wang wrote:
> >
> > > This seems reasonable to merge window metadata as a column (more
> compact
> > > format).
> > >
> > > Maybe Julian can comment whether there was special consideration to
> > define
> > > window_start and window_end as two separate columns?
> > >
> > >
> > > -Rui
> > >
> > > On Wed, Oct 21, 2020 at 8:20 PM Danny Chan 
> wrote:
> > >
> > > > Some of our fellows suggest the window table value functions to
> append
> > a
> > > > structure type field:
> > > >
> > > > window: ROW(start: timestamp(3), end: timestamp(3))
> > > >
> > > > For example if
> > > >
> > > > table T has schema: (ts: timestamp(3), b: varchar(20)), then
> > > >
> > > > Tumble(TABLE T, DESCRIPTOR(T.ts), INTERVAL '5' MINUTE)
> > > >
> > > > has return type
> > > >
> > > > Row(ts timestamp(3), b varchar(20), window Row(start timestamp(3),
> end
> > > > timestamp(3)))
> > > >
> > > > Thus in the aggregate key or the join condition, we can compare the
> > > > structure “window” field directly, e.g.
> > > >
> > > > GROUP BY T.window or ON L.window = R.window
> > > >
> > > > What do you think about this ?
> > > >
> > > > On Tue, Jun 9, 2020 at 1:34 AM Rui Wang wrote:
> > > >
> > > > > On Mon, Jun 8, 2020 at 12:04 AM Viliam Durina <
> vil...@hazelcast.com>
> > > > > wrote:
> > > > >
> > > > > > > 2. rename the hop_size to emit_frequency. Because that's what
> > this
> > > > > > argument tries to say: e.g. emit in every x time_unit.
> > > > > >
> > > > > > It's not about how often you emit. By "emit" I mean the moment
> when
> > > the
> > > > > > rows are sent out. For example, when working with event time and
> > the
> > > > > input
> > > > > > stalls, you might emit much later. Or you can use it for
> historical
> > > > data
> > > > > in
> > > > > > which case the actual interval will be much shorter. It's the
> time
> > > > > interval
> > > > > > by which the window "hops". Two subsequent windows will be this
> > much
> > > > > apart.
> > > > > >
> > > > > > Regarding the DESCRIPTOR, the PTF is supposed to be implemented
> by
> > up
> > > > to
> > > > > 4
> > > > > > member functions: describe, start, fulfill, finish. The
> `describe`
> > > > > function
> > > > > > is supposed to be called during validation. It can validate the
> > > > > arguments.
> > > > > > In

Re: [DISCUSS] Some quesitons about the new introduced window table functions (TUMBLE, HOP and SESSION)

2020-10-22 Thread Danny Chan

Rui Wang, could you help consult the author of the window TVF syntax?
We would like to hear more suggestions.

cc @Rui Wang

On Thu, Oct 22, 2020 at 12:01 PM Rui Wang wrote:

> This seems reasonable to merge window metadata as a column (more compact
> format).
>
> Maybe Julian can comment whether there was special consideration to define
> window_start and window_end as two separate columns?
>
>
> -Rui
>
> On Wed, Oct 21, 2020 at 8:20 PM Danny Chan  wrote:
>
> > Some of our fellows suggest the window table value functions to append a
> > structure type field:
> >
> > window: ROW(start: timestamp(3), end: timestamp(3))
> >
> > For example if
> >
> > table T has schema: (ts: timestamp(3), b: varchar(20)), then
> >
> > Tumble(TABLE T, DESCRIPTOR(T.ts), INTERVAL '5' MINUTE)
> >
> > has return type
> >
> > Row(ts timestamp(3), b varchar(20), window Row(start timestamp(3), end
> > timestamp(3)))
> >
> > Thus in the aggregate key or the join condition, we can compare the
> > structure “window” field directly, e.g.
> >
> > GROUP BY T.window or ON L.window = R.window
> >
> > What do you think about this ?
> >
> > On Tue, Jun 9, 2020 at 1:34 AM Rui Wang wrote:
> >
> > > On Mon, Jun 8, 2020 at 12:04 AM Viliam Durina 
> > > wrote:
> > >
> > > > > 2. rename the hop_size to emit_frequency. Because that's what this
> > > > argument tries to say: e.g. emit in every x time_unit.
> > > >
> > > > It's not about how often you emit. By "emit" I mean the moment when
> the
> > > > rows are sent out. For example, when working with event time and the
> > > input
> > > > stalls, you might emit much later. Or you can use it for historical
> > data
> > > in
> > > > which case the actual interval will be much shorter. It's the time
> > > interval
> > > > by which the window "hops". Two subsequent windows will be this much
> > > apart.
> > > >
> > > > Regarding the DESCRIPTOR, the PTF is supposed to be implemented by up
> > to
> > > 4
> > > > member functions: describe, start, fulfill, finish. The `describe`
> > > function
> > > > is supposed to be called during validation. It can validate the
> > > arguments.
> > > > In case of TUMBLE, it can check whether the column described by the
> > > > descriptor is present in the input table and if it is of a correct
> > type.
> > > It
> > > > can throw errors which will become validation errors.
> > >
> > > Yes. Descriptor has a feature to enable a type checking. If considering
> > it,
> > > functions will be (argument name not decided yet):
> > >
> > > TUMBLE(
> > > data TABLE,
> > > time_column DESCRIPTOR(*TIMESTAMP*),
> > > window_size INTERVAL DAY TO SECOND,
> > > offset INTERVAL DAY TO SECOND DEFAULT INTERVAL '0' SECONDS)
> > >
> > > HOP(
> > > data TABLE,
> > > time_column DESCRIPTOR(*TIMESTAMP*),
> > > window_size INTERVAL DAY TO SECOND,
> > > emit_frequency INTERVAL DAY TO SECOND,
> > > offset INTERVAL DAY TO SECOND DEFAULT INTERVAL '0' SECONDS)
> > >
> > >
> > >
> > >
> > > > It also determines
> > > > the actual output table type.
> > > >
> > > > I think it's worthwhile to read the PTF spec before implementing
> this,
> > > it's
> > > > freely available here:
> > > >
> > > >
> > >
> >
> https://standards.iso.org/ittf/PubliclyAvailableStandards/c069776_ISO_IEC_TR_19075-7_2017.zip
> > > >
> > > > Viliam
> > > >
> > > > On Sun, 7 Jun 2020 at 06:23, Rui Wang  wrote:
> > > >
> > > > > Thanks Danny for your summary!
> > > > >
> > > > > For question one, I prefer the ordering of parameter in option two.
> > > > > However, for argument names, I think it might be better to
> > > > > 1. name the window size explicitly by "window_size".
> > > > > 2. rename the hop_size to emit_frequency. Because that's what this
> > > > > argument tries to say: e.g. emit in every x time_unit.
> > > > > 3. don't use "table" as the first argument name, because "table"
> is a
> > > > > keyword. So we can replace it by "data"
> > > > >
> > > > > so it might be
> > > > > T

Re: [DISCUSS] Apache Calcite Online Meetup January 2021

2020-10-22 Thread Danny Chan

+1 for the online meeting, nice idea ~

Best,
Danny Chan
On Oct 22, 2020 at 5:58 PM +0800, JiaTao Tao wrote:
> I think it's a nice idea, hope to hear Julian's topic about lattice/mv, we
> have a big demand on this and want to learn more.
>
>
> Regards!
>
> Aron Tao
>
>
> On Fri, Oct 16, 2020 at 12:50 PM Julian Hyde wrote:
>
> > Nice idea, Stamatis. I'm happy to give a talk if people want to hear
> > it - possible topics include geospatial in Calcite, quick intro to
> > RelBuilder, and lattices/materialized views.
> >
> > On Thu, Oct 15, 2020 at 10:35 AM Rui Wang  wrote:
> > >
> > > I can give a small talk (10 to 20 mins) about changes of Calcite to
> > support
> > > streaming.
> > >
> > >
> > > -Rui
> > >
> > > On Thu, Oct 15, 2020 at 7:19 AM Alessandro Solimando <
> > > alessandro.solima...@gmail.com> wrote:
> > >
> > > > Hi Stamatis,
> > > > that's a very nice idea, I am definitely interested in attending a
> > meetup
> > > > and hearing presentations around Calcite.
> > > >
> > > > Best regards,
> > > > Alessandro
> > > >
> > > > On Thu, 15 Oct 2020 at 15:09, Stamatis Zampetakis 
> > > > wrote:
> > > >
> > > > > Hi guys,
> > > > >
> > > > > How do you feel about participating in an online meetup for Calcite in
> > > > > January 2021?
> > > > >
> > > > > It's been forever since we organised a Calcite meetup and with the current
> > > > > situation (pandemic) it will be almost impossible to hold one in person in
> > > > > the near future.
> > > > >
> > > > > Nevertheless, people now more than ever tend to organise and attend such
> > > > > events so we could take advantage of the situation to exchange ideas, learn
> > > > > new things, and meet each other in a more interactive way.
> > > > >
> > > > > We can have a few presentations and slots for open discussion among us.
> > > > >
> > > > > The subject can be anything from something very technical to a very simple
> > > > > presentation of an adapter, the use of calcite in a university course, in
> > > > > academia, or whatever else you can imagine.
> > > > >
> > > > > The duration can be also flexible, ranging from a few minutes to an hour or
> > > > > more depending on the courage of the speaker.
> > > > >
> > > > > I'm sure most of us have some slides here and there that could feed a small
> > > > > presentation around Calcite so let's make this happen!
> > > > >
> > > > > So are there people interested to give a talk around Calcite?
> > > > >
> > > > > Personally, I can give a presentation on very basic stuff on query
> > > > > optimization and Calcite showing a bit of motivation and going through the
> > > > > main APIs with examples and code samples.
> > > > >
> > > > > Apart from speakers, are there people interested in attending such meetups?
> > > > >
> > > > > We can fix the date and time based on the region of the majority.
> > > > >
> > > > > Best,
> > > > > Stamatis
> > > > >
> > > >
> >

Re: RelBuilder#project should try keep alias in optimizer

2020-10-21 Thread Danny Chan

During planning, Calcite optimizes based only on the field input refs; if we 
always keep the alias, there are many cases that can no longer be optimized. I 
collected some cases that are already tracked in JIRA:

[1] https://issues.apache.org/jira/browse/CALCITE-4037
[2] https://issues.apache.org/jira/browse/CALCITE-3662
[3] https://issues.apache.org/jira/browse/CALCITE-1584
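
To make the trade-off concrete, here is a toy Python model. It is not Calcite code: `Project`, `is_identity` and `is_trivial` are hypothetical names invented for this sketch. It assumes a projection can only be dropped when it changes nothing observable, so if aliases must always be preserved, a projection that merely renames fields can no longer be removed even though its input refs are the identity.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Project:
    """Toy projection: output i reads input column input_refs[i], named names[i]."""
    input_refs: Tuple[int, ...]
    names: Tuple[str, ...]


def is_identity(project: Project, input_names: Tuple[str, ...]) -> bool:
    """True when the projection forwards every input column in order."""
    return project.input_refs == tuple(range(len(input_names)))


def is_trivial(project: Project, input_names: Tuple[str, ...], keep_alias: bool) -> bool:
    """Decide whether the projection can be dropped.

    With keep_alias=False only the input refs matter; with keep_alias=True
    the output names must also match the input names.
    """
    if not is_identity(project, input_names):
        return False
    return (not keep_alias) or project.names == input_names


input_names = ("EMPNO", "ENAME")
renaming = Project(input_refs=(0, 1), names=("id", "name"))

print(is_trivial(renaming, input_names, keep_alias=False))  # True
print(is_trivial(renaming, input_names, keep_alias=True))   # False
```

The JIRA issues listed above describe the real cases; this sketch only illustrates the general shape of the problem.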

Best,
Danny Chan
On Oct 21, 2020 at 2:49 PM +0800, JiaTao Tao wrote:
> Hi fellows
> I found that in some places we just call
> `project(Iterable<? extends RexNode> nodes)`, but this loses the alias. Why
> don't we call
> `project(Iterable<? extends RexNode> nodes, Iterable<String> fieldNames)`
> to keep the alias?
>
> Regards!
>
> Aron Tao

Re: [QUESTION] Does SQL standard allows project nested fields for aggregate ?

2020-10-20 Thread Danny Chan

Thanks, it works ~ I didn't find evidence that the SQL standard forbids 
this; at least, the semantics are clear.

Best,
Danny Chan
On Oct 21, 2020 at 7:58 AM +0800, Jinfeng Ni wrote:
> SELECT f0.a, max(f1) FROM t1 GROUP BY f0
>
> I think the problem with the 1st query is that the parser is confused
> with 'f0' being a table or column name.
>
> The PostgreSQL doc [1] suggests using () to denote the column, and the part
> after () to denote the subfield. In your case, the query should be:
>
> SELECT (f0).a, max(f1) FROM t1 GROUP BY f0;
>
> 1. https://www.postgresql.org/docs/current/rowtypes.html
>
> On Tue, Oct 20, 2020 at 12:17 PM Julian Hyde  wrote:
> >
> > I think that your first query,
> >
> > SELECT f0.a, max(f1) FROM t1 GROUP BY f0
> >
> > should be valid. I don't know whether the SQL standard thinks it
> > should be valid, or whether Calcite can handle it. And I don't know
> > why PostgreSQL has a problem with it.
> >
> > Julian
> >
> > On Mon, Oct 19, 2020 at 9:41 PM Danny Chan  wrote:
> > >
> > > CREATE TYPE my_type AS ( a int , b VARCHAR(20));
> > >
> > > create table t1(
> > > f0 my_type,
> > > f1 int,
> > > f2 varchar(20)
> > > );
> > >
> > > > insert into t1 values((1, 'abc'), 2, 'def');
> > > >
> > > > SELECT f0.a, max(f1) FROM t1 GROUP BY f0; -- this is invalid in PostgreSQL
> > > >
> > > > SELECT f0, max(f1) FROM t1 GROUP BY f0; -- this is a valid query
> > > >
> > > >
> > > > My question is: does the SQL standard allow projecting nested fields in an
> > > > aggregate query? Current Calcite throws, complaining that the nested
> > > > field cannot be seen in the scope (somewhat the same as PG).
> > >
> > > Best,
> > > Danny Chan

[QUESTION] Does SQL standard allows project nested fields for aggregate ?

2020-10-19 Thread Danny Chan

CREATE TYPE my_type AS ( a int , b VARCHAR(20));

create table t1(
  f0 my_type,
  f1 int,
  f2 varchar(20)
);

insert into t1 values((1, 'abc'), 2, 'def');

SELECT f0.a, max(f1) FROM t1 GROUP BY f0; -- this is invalid in PostgreSQL

SELECT f0, max(f1) FROM t1 GROUP BY f0; -- this is a valid query


My question is: does the SQL standard allow projecting nested fields in an 
aggregate query? Current Calcite throws, complaining that the nested field 
cannot be seen in the scope (somewhat the same as PG).
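
The intended meaning of the first query can be sketched in plain Python, as an illustration only: it says nothing about what the SQL standard permits. The sample rows beyond the first are made up for the example, and `group_max_f1` is a hypothetical helper. The point is that `f0.a` is well defined per group, because every row in a group shares the same `f0`.

```python
from collections import namedtuple

# Stand-in for: CREATE TYPE my_type AS (a int, b VARCHAR(20))
MyType = namedtuple("MyType", ["a", "b"])

# Rows of t1 as (f0, f1, f2); only the first row comes from the example above.
t1 = [
    (MyType(1, "abc"), 2, "def"),
    (MyType(1, "abc"), 5, "ghi"),
    (MyType(2, "xyz"), 3, "jkl"),
]


def group_max_f1(rows):
    """Emulate SELECT f0.a, max(f1) FROM t1 GROUP BY f0."""
    max_by_f0 = {}
    for f0, f1, _f2 in rows:
        max_by_f0[f0] = max(f1, max_by_f0.get(f0, f1))
    # f0 is the group key, so projecting its field a is unambiguous.
    return [(f0.a, max_f1) for f0, max_f1 in max_by_f0.items()]


print(group_max_f1(t1))  # [(1, 5), (2, 3)]
```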

Best,
Danny Chan

Re: It's my pleasure to introduce myself to you

2020-10-14 Thread Danny Chan

Here are some learning suggestions referenced from Roman ~

IMO the first thing you need to do is to read the whitepaper [1] to
understand the big picture.

Then you can read the Calcite presentation [2] where all main API's are
very well described.

And the last step, which is my main source of knowledge about Calcite,
is to explore the source code of systems that already use Calcite for
query processing. I would recommend Apache Ignite [3], Apache Drill [4],
Hazelcast [5] and many others that you can find here [6].

[1] https://arxiv.org/pdf/1802.10233.pdf
[2]
https://www.slideshare.net/JordanHalterman/introduction-to-apache-calcite
[3] https://github.com/apache/ignite
[4] https://github.com/apache/drill
[5] https://github.com/hazelcast/hazelcast
[6] https://calcite.apache.org/docs/powered_by.html

Best,
Danny Chan
On Oct 14, 2020 at 2:29 PM +0800, guangyuan wang wrote:
> Hi Danny Chan
> Nice to meet you. I'm very glad to see that the community is so active.
> Thank you very much.
> I am not very familiar with calcite. Do you have any suggestions for the
> newcomer?
> I'd like to fix some bugs in Jira. But it's hard for me to figure out which
> one(bug) is more friendly for me in Jira, could you recommend some to me?
>
> On Wed, Oct 14, 2020 at 11:17 AM Danny Chan wrote:
>
> > Welcome Guangyuan Wang ~
> >
> > Very glad that you are interested in the Calcite community, hope we can
> > work together to make it better ~
> >
> > You can file issues and ask/answer questions; basically any contribution
> > is appreciated ~
> >
> > Best,
> > Danny Chan
> > On Oct 14, 2020 at 8:48 AM +0800, guangyuan wang wrote:
> > > Hello, everyone.
> > > My name is Guangyuan Wang. I'm a java software engineer and also an Apache
> > > committer. As I am very interested in SQL optimizer, I'd like to join the
> > > calcite community. To contribute to the community is my honor.
> > > Thank you.
> > >
> > >
> > >
> > > Best Wishes
> > >
> > > Guangyuan Wang
> >

Re: It's my pleasure to introduce myself to you

2020-10-13 Thread Danny Chan

Welcome Guangyuan Wang ~

Very glad that you are interested in the Calcite community, hope we can work 
together to make it better ~

You can file issues and ask/answer questions; basically any contribution is 
appreciated ~

Best,
Danny Chan
On Oct 14, 2020 at 8:48 AM +0800, guangyuan wang wrote:
> Hello, everyone.
> My name is Guangyuan Wang. I'm a java software engineer and also an Apache
> committer. As I am very interested in SQL optimizer, I'd like to join the
> calcite community. To contribute to the community is my honor.
> Thank you.
>
>
>
> Best Wishes
>
> Guangyuan Wang

Re: TableScan#explainTerms may be not enough for digest

2020-10-09 Thread Danny Chan

Thanks for driving this discussion, JiaTao. Regarding the Flink way:

Extending the digest through getQualifiedName is somewhat hacky.

We actually need some interfaces to reflect/represent the pushed info, i.e. the 
pushed fields or filter expression.
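
As a rough illustration of why the pushed info belongs in the digest, here is a toy Python sketch. It is not Calcite's or Flink's API: `scan_digest` and its term names are hypothetical. It assumes the digest is used to tell plan nodes apart, so scans of the same table with different pushed fields or filters must not end up with the same digest.

```python
def scan_digest(qualified_name, pushed_fields=None, pushed_filter=None):
    """Build a digest string for a toy table scan.

    Terms for pushed-down fields and filter are added only when present.
    """
    terms = ["table=[" + ", ".join(qualified_name) + "]"]
    if pushed_fields is not None:
        terms.append("fields=[" + ", ".join(pushed_fields) + "]")
    if pushed_filter is not None:
        terms.append("filter=[" + pushed_filter + "]")
    return "TableScan(" + ", ".join(terms) + ")"


plain = scan_digest(["db", "orders"])
pruned = scan_digest(["db", "orders"], pushed_fields=["id", "amount"])
filtered = scan_digest(["db", "orders"], pushed_filter="amount > 10")

print(plain)   # TableScan(table=[db, orders])
print(pruned)  # TableScan(table=[db, orders], fields=[id, amount])
```

With only the table name in the digest, all three scans above would look identical to anything that compares digests.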

Best,
Danny Chan
On Oct 9, 2020 at 3:35 PM +0800, dev@calcite.apache.org wrote:
>
> getQualifiedName

Re: [RESULT] [VOTE] Release apache-calcite-1.26.0 (release candidate 0)

2020-10-06 Thread Danny Chan

Thanks for your work, Ruben! Releasing is hard work, but you made it easy :)

On Wed, Oct 7, 2020 at 2:08 AM Ruben Q L wrote:

> Hi all,
>
> I have rolled out the release, there is still the site that needs to be
> published (working on that), but in the meanwhile I assume it is safe to
> re-open master branch for commits.
> Thanks everyone that participated in this release.
>
> Best,
> Ruben
>
>
>
> On Tue, Oct 6, 2020 at 4:15 PM Ruben Quesada Lopez 
> wrote:
>
> > Thanks to everyone who has tested the release candidate and given their
> > comments and votes.
> >
> > The tally is as follows.
> >
> > 3 binding +1s:
> > Vladimir
> > Danny
> > Ruben
> >
> > 2 non-binding +1s:
> > Andrei
> > Enrico
> >
> > No 0s or -1s.
> >
> > Therefore I am delighted to announce that the proposal to release Apache
> > Calcite 1.26.0 has passed.
> >
> > Thanks everyone. We’ll now roll the release out to the mirrors.
> >
> > Ruben
> >
> >
>

Re: [VOTE] Release apache-calcite-1.26.0 (release candidate 0)

2020-10-05 Thread Danny Chan

+1 (binding).

- run tests locally: ok
- verify the commit hash in git tag: ok
- check sha512: ok
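
For readers who want to repeat the checksum step, a minimal Python sketch follows. The helper names are hypothetical, and it assumes the published `.sha512` content starts with the hex digest; it does not cover signature verification.

```python
import hashlib


def sha512_of(path, chunk_size=1 << 20):
    """Return the hex SHA-512 digest of a file, read in chunks."""
    digest = hashlib.sha512()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_checksum(path, expected):
    """Compare a file against the first token of a non-empty checksum string."""
    return sha512_of(path) == expected.strip().split()[0].lower()
```

Usage would look like `matches_checksum(artifact_path, open(artifact_path + ".sha512").read())`, where `artifact_path` points at the downloaded release file.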

Environment:
- java version 1.8.0_151
- MacOS Mojave 10.14.6

I also ran into the newly introduced esri library when upgrading Flink. I
solved the problem by adding a dependency on it. It would be great if we could
avoid it, but I guess it is not a blocker.

On Mon, Oct 5, 2020 at 10:58 PM Enrico Olivelli wrote:

> On Mon, Oct 5, 2020 at 4:43 PM Ruben Q L wrote:
>
>
>
> > Enrico,
>
> >
>
> > thanks for your feedback. I also had a similar experience with my
>
> > downstream project.
>
> >
>
> > If I am not mistaken, esri library so far was not necessary because it
> was
>
> > used just on Calcite test; however, as you say, now it is required since
> it
>
> > is used for Calcite build.
>
> > The root cause of this change seems to be CALCITE-1861 [1], implemented
> via
>
> > [2]. I think there is not much we can do about it.
>
> >
>
>
>
> So probably this is the line that introduces that hard dependency
>
>
> https://github.com/apache/calcite/commit/eab043f4ef43112c16a9f6708e6c53a15b1cfbe0#diff-39080ab40de0df7a4df59ca135c79d2fR377
>
>
>
> I will try to find some fix, at least to not fail the loading of Programs
>
> class
>
> I am going to file a JIRA for this case
>
>
>
>
>
> Enrico
>
>
>
>
>
>
>
> >
>
> > Regarding the new SEARCH operator, I also had to adjust my code.
>
> > For me, the easiest way to "keep things working as before" was basically
>
> > adding a "RexUtil#expandSearch" call at the appropriate place in order to
>
> > convert the SEARCH into its equivalent non-search expression.
>
> >
>
> > Best,
>
> > Ruben
>
> >
>
> > [1] https://issues.apache.org/jira/browse/CALCITE-1861
>
> > [2]
>
> >
>
> >
> https://github.com/apache/calcite/commit/eab043f4ef43112c16a9f6708e6c53a15b1cfbe0
>
> >
>
> >
>
> > On Mon, Oct 5, 2020 at 3:17 PM Enrico Olivelli 
>
> > wrote:
>
> >
>
> > > Ruben,
>
> > > I am testing the RC, I found that now it is mandatory to have
>
> > > "com.esri.geometry:esri-geometry-api" on the classpath in order to use
>
> > the
>
> > > Programs class.
>
> > >
>
> > > In HerdDB we were excluding that third party dependency, and we would
>
> > like
>
> > > not to depend on it in order to bring in as few third party deps as
>
> > > possible (and have smaller binaries).
>
> > > So I would cast a -1
>
> > > (also it would be super good to not have to depend
>
> > > on com.jayway.jsonpath:json-path, but that needed dependency was there
>
> > > before 1.26 and there is no need to change it now)
>
> > >
>
> > > I can try to work on this issue as soon as possible if you think it is
>
> > > worth it.
>
> > >
>
> > > Apart from that we are trying to make it work to switch to the "SEARCH"
>
> > > operator, that is a big behavior change.
>
> > > I guess there is no way to disable it. But I am fine with that, I knew
>
> > that
>
> > > it is only a matter of implementing that operator as well.
>
> > >
>
> > > Enrico
>
> > >
>
> > > > On Mon, Oct 5, 2020 at 2:42 PM Vladimir Sitnikov <
>
> > > > sitnikov.vladi...@gmail.com> wrote:
>
> > >
>
> > > > Thanks for preparing an RC.
>
> > > >
>
> > > > +1
>
> > > >
>
> > > > Build works for me modulo
>
> > > > https://issues.apache.org/jira/browse/CALCITE-2816 PsTableFunction
>
> > > > fails in Russian locale.
>
> > > >
>
> > > > Checksums match, signature validation passes.
>
> > > > The release notes look good.
>
> > > >
>
> > > > changelog> Compatibility: Guava versions 19.0 to 29.0-jre
>
> > > >
>
> > > > I see build scripts are using Guava 29.0-jre only. I guess Guava 19
>
> > > support
>
> > > > comes like "hope it works".
>
> > > >
>
> > > > Vladimir
>
> > > >
>
> > >
>
> >
>
>

Re: [DISCUSS] Towards Calcite 1.26.0

2020-10-03 Thread Danny Chan

AFAIK,

• CALCITE-4279: the SEARCH operator is new and changes the RexNode tree, which 
is breaking because downstream projects need to adapt to it
• CALCITE-2569: the UDF needs to implement a new interface SqlTableFunction
• CALCITE-3923: many singletons were removed, which is a breaking change
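
For downstream projects adapting to SEARCH, the idea of turning it back into its equivalent non-search expression (what the `RexUtil#expandSearch` call mentioned in the vote thread is for) can be sketched with a toy Python analogue. This is not Calcite's API: `expand_search` and the tuple encoding of ranges are assumptions made for illustration, and unbounded ranges and NULL handling are left out.

```python
def expand_search(column, ranges):
    """Expand a list of ranges on one column into an OR of comparisons.

    Each range is (low, high, low_inclusive, high_inclusive); a closed range
    with low == high is treated as a single point.
    """
    disjuncts = []
    for low, high, low_inc, high_inc in ranges:
        if low == high and low_inc and high_inc:
            disjuncts.append(f"{column} = {low}")
        else:
            lower = f"{column} {'>=' if low_inc else '>'} {low}"
            upper = f"{column} {'<=' if high_inc else '<'} {high}"
            disjuncts.append(f"({lower} AND {upper})")
    return " OR ".join(disjuncts)


print(expand_search("x", [(1, 1, True, True), (5, 10, False, False)]))
# x = 1 OR (x > 5 AND x < 10)
```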


Best,
Danny Chan
On Oct 3, 2020 at 1:12 AM +0800, Ruben Q L wrote:
> I'm working on the release notes, any breaking change worth mentioning
> apart from https://issues.apache.org/jira/browse/CALCITE-2082 ?
>
>
> On Fri, Oct 2, 2020 at 11:59 AM Ruben Q L  wrote:
>
> > Thanks for the answer Vladimir.
> > 
> >
> > I have a tight schedule today. I hope to generate a RC0 today or worst
> > case scenario tomorrow.
> >
> > Best,
> > Ruben
> >
> >
> > On Fri, Oct 2, 2020 at 11:47 AM Vladimir Sitnikov <
> > sitnikov.vladi...@gmail.com> wrote:
> >
> > > Ruben>Could someone that has recently built a release using the new
> > > gradle-based
> > > Ruben>process confirm this point?
> > >
> > > oh, that was a question by Ruben, and I copy-pasted the wrong name :)
> > >
> > > Vladimir
> > >
> >

Re: [DISCUSS] Towards Calcite 1.26.0

2020-09-29 Thread Danny Chan

The schedule sounds good to me, thanks Ruben for being the release manager ~

Best,
Danny Chan
On Sep 30, 2020 at 3:44 AM +0800, Julian Hyde wrote:
> > What is the status on 4279 (SEARCH operator into Druid)?
>
> Working on it.
>
> > On Sep 29, 2020, at 11:58 AM, Ruben Q L  wrote:
> >
> > Thanks for the feedback, Julian.
> > I will keep in mind changing 'MacOS' to 'macOS' in the release notes.
> > What is the status on 4279 (SEARCH operator into Druid)?
> >
> > Ruben
> >
> >
> >
> > On Tue, Sep 29, 2020 at 7:51 PM Julian Hyde wrote:
> >
> > > Thursday sounds good.
> > >
> > > I am working on getting 3752 (PIVOT) and 4238 (parser configuration)
> > > passing CI and merged. (4238 changes buildSrc and that seems to freak out
> > > Gradle’s cache in CI.)
> > >
> > > For 4034 (InnoDB adapter), I am waiting for Xu (neoremind) to fix a
> > > remaining bug. Don’t hold the release for 4034.
> > >
> > > As RM, can you please change ‘MacOS’ to ‘macOS’ (2 occurrences) when you
> > > write the release notes. It got changed by accident last release.
> > >
> > > Julian
> > >
> > >
> > > > On Sep 29, 2020, at 12:36 AM, Ruben Q L  wrote:
> > > >
> > > > Hello,
> > > >
> > > > things are moving forward, thanks everyone for pushing the pending
> > > > issues into the release.
> > > > There are still a few unresolved tickets [1], but most of them have a PR
> > > > available and in good shape, so I think we can be reasonably optimistic.
> > > > Would it sound feasible to get these final tickets resolved in the next
> > > > couple of days and enter code freeze for master e.g. on Thursday night
> > > > in order to have a first RC during the weekend?
> > > >
> > > > Best,
> > > > Ruben
> > > >
> > > > [1]
> > > >
> > > https://issues.apache.org/jira/secure/Dashboard.jspa?selectPageId=12333950
> > > >
> > > >
> > > > On Fri, Sep 25, 2020 at 6:47 AM Chunwei Lei wrote:
> > > >
> > > > > Thank you for driving this, Ruben.
> > > > >
> > > > > The schedule looks good to me.
> > > > >
> > > > > Best,
> > > > > Chunwei
> > > > >
> > > > >
> > > > > On Thu, Sep 24, 2020 at 3:26 AM Rui Wang  wrote:
> > > > >
> > > > > > Thanks for all your help! Will experiment how it works best for me!
> > > > > >
> > > > > >
> > > > > > -Rui
> > > > > >
> > > > > > On Wed, Sep 23, 2020 at 12:06 PM Ruben Q L  
> > > > > > wrote:
> > > > > >
> > > > > > > Rui,
> > > > > > > I am not a git expert, but what I usually do in these cases is (working in
> > > > > > > my local feature branch):
> > > > > > > git rebase -i master
> > > > > > > (editor will be opened, "pick" first commit, "squash" the rest; then edit
> > > > > > > your final squashed commit message)
> > > > > > > Then you can force push. In your PR it should appear 1 commit instead of N.
> > > > > > > This is just one way to do it, if you search online, you'll probably find
> > > > > > > better explanations than mine for this "git squash commits" operation.
> > > > > > >
> > > > > > > Best regards,
> > > > > > > Ruben
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > > > On Wed, Sep 23, 2020 at 7:50 PM Julian Hyde wrote:
> > > > > > >
> > > > > > > > Feel free to experiment. If it creates a bad result - e.g. a 
> > > > > > > > merge
> > > > > > > > commit, or does not squash - quickly back it out with a force 
> > > > > > > > push to
>

Re: [DISCUSS] The date to release Calcite 1.26.0

2020-09-27 Thread Danny Chan

Hi, Ruben, there are a few issues left [1] for the next release, 1.26.0. If we 
want to cut an RC on 3rd October, we need to clarify which ones should be 
included and the progress of each issue (whether it needs a reviewer, when it 
can be merged, and whether it fits the release timeline).

[1] 
https://issues.apache.org/jira/issues/?jql=project%20%3D%20CALCITE%20AND%20status%20in%20(Open%2C%20%22In%20Progress%22%2C%20Reopened)%20AND%20resolution%20%3D%20Unresolved%20AND%20fixVersion%20%3D%201.26.0%20ORDER%20BY%20priority%20DESC%2C%20updated%20DESC

Best,
Danny Chan
On Sep 28, 2020 at 10:49 AM +0800, Danny Chan wrote:
> I will do that.
>
> Best,
> Danny Chan
> On Sep 28, 2020 at 3:35 AM +0800, James Starr wrote:
> > Could I get CALCITE-4210 included in release 1.26.0?
> >
> > On Fri, Sep 25, 2020 at 1:50 AM Danny Chan  wrote:
> >
> > > Thanks. I will prepare my local test as fast as possible.
> > >
> > > On Fri, Sep 25, 2020 at 4:30 PM Ruben Q L wrote:
> > >
> > > > Danny,
> > > >
> > > > I will do my best to have a RC by the end of next week (this is my first
> > > >
> > > > time as release manager, I hope I do not find any setbacks in the
> > > process).
> > > >
> > > > Unfortunately I cannot promise a date for the release (I think nobody
> > > can),
> > > >
> > > > because other factors need to be considered (the vote, if several RCs
> > > need
> > > >
> > > > to be generated until one gets finally approved, etc). Having said that,
> > > >
> > > > 8th October seems reasonable, but again, I cannot promise anything.
> > > >
> > > >
> > > >
> > > > Best regards,
> > > >
> > > > Ruben
> > > >
> > > >
> > > >
> > > >
> > > >
> > > > On Fri, Sep 25, 2020 at 9:05 AM Danny Chan wrote:
> > > >
> > > >
> > > >
> > > > > Ruben , can we make sure that the next release happens before 8th
> > > > October?
> > > >
> > > > > I would do local tests for it before that.
> > > >
> > > > >
> > > >
> > > > > On Thu, Sep 24, 2020 at 2:32 AM Ruben Q L wrote:
> > > >
> > > > >
> > > >
> > > > > > Allright, I will send a separate email to start the "official"
> > > > discussion
> > > >
> > > > > >
> > > >
> > > > > > for 1.26.0.
> > > >
> > > > > >
> > > >
> > > > > >
> > > >
> > > > > >
> > > >
> > > > > > Best,
> > > >
> > > > > >
> > > >
> > > > > > Ruben
> > > >
> > > > > >
> > > >
> > > > > >
> > > >
> > > > > >
> > > >
> > > > > >
> > > >
> > > > > >
> > > >
> > > > > > On Wed, Sep 23, 2020 at 5:17 PM Julian Hyde wrote:
> > > >
> > > > > >
> > > >
> > > > > >
> > > >
> > > > > >
> > > >
> > > > > > > I agree, this is a good time for a release. Remember that releases
> > > >
> > > > > >
> > > >
> > > > > > > always take a week or two longer than we think.
> > > >
> > > > > >
> > > >
> > > > > > >
> > > >
> > > > > >
> > > >
> > > > > > > Danny, I need your help on
> > > >
> > > > > >
> > > >
> > > > > > > https://issues.apache.org/jira/browse/CALCITE-4259 (supporting JDK
> > > > 15
> > > >
> > > > > >
> > > >
> > > > > > > and removing log4j), which is a must-fix for 1.26.
> > > >
> > > > > >
> > > >
> > > > > > >
> > > >
> > > > > >
> > > >
> > > > > > > Julian
> > > >
> > > > > >
> > > >
> > > > > > >
> > > >
> > > > > >
> > > >
> > > > > > >
> > > >

Re: [DISCUSS] The date to release Calcite 1.26.0

2020-09-25 Thread Danny Chan

Thanks. I will prepare my local tests as soon as possible.

On Fri, Sep 25, 2020 at 4:30 PM Ruben Q L wrote:

> Danny,
>
> I will do my best to have a RC by the end of next week (this is my first
>
> time as release manager, I hope I do not find any setbacks in the process).
>
> Unfortunately I cannot promise a date for the release (I think nobody can),
>
> because other factors need to be considered (the vote, if several RCs need
>
> to be generated until one gets finally approved, etc). Having said that,
>
> 8th October seems reasonable, but again, I cannot promise anything.
>
>
>
> Best regards,
>
> Ruben
>
>
>
>
>
> On Fri, Sep 25, 2020 at 9:05 AM Danny Chan wrote:
>
>
>
> > Ruben , can we make sure that the next release happens before 8th
> October?
>
> > I would do local tests for it before that.
>
> >
>
> > On Thu, Sep 24, 2020 at 2:32 AM Ruben Q L wrote:
>
> >
>
> > > Allright, I will send a separate email to start the "official"
> discussion
>
> > >
>
> > > for 1.26.0.
>
> > >
>
> > >
>
> > >
>
> > > Best,
>
> > >
>
> > > Ruben
>
> > >
>
> > >
>
> > >
>
> > >
>
> > >
>
> > > On Wed, Sep 23, 2020 at 5:17 PM Julian Hyde wrote:
>
> > >
>
> > >
>
> > >
>
> > > > I agree, this is a good time for a release. Remember that releases
>
> > >
>
> > > > always take a week or two longer than we think.
>
> > >
>
> > > >
>
> > >
>
> > > > Danny, I need your help on
>
> > >
>
> > > > https://issues.apache.org/jira/browse/CALCITE-4259 (supporting JDK
> 15
>
> > >
>
> > > > and removing log4j), which is a must-fix for 1.26.
>
> > >
>
> > > >
>
> > >
>
> > > > Julian
>
> > >
>
> > > >
>
> > >
>
> > > >
>
> > >
>
> > > > On Wed, Sep 23, 2020 at 2:30 AM Danny Chan 
>
> > wrote:
>
> > >
>
> > > > >
>
> > >
>
> > > > > Thanks so much, Ruben ~
>
> > >
>
> > > > >
>
> > >
>
> > > > > Can the RC date be one week in advance, say about 3th-8th October,
>
> > >
>
> > > > because I need some time for the upgrade of Flink side.
>
> > >
>
> > > > >
>
> > >
>
> > > > > Before the RC, I need amount of local tests for Flink and I would
>
> > have
>
> > >
>
> > > > to make sure that the 1.26.0 is production ready for the nest Flink
>
> > > release.
>
> > >
>
> > > > >
>
> > >
>
> > > > > And it would be great if we can make this sooner, this is important
>
> > >
>
> > > > because I can fire the upgrade PR sooner and actually we have many
>
> > > features
>
> > >
>
> > > > developed based on current master branch.
>
> > >
>
> > > > >
>
> > >
>
> > > > > Best,
>
> > >
>
> > > > > Danny Chan
>
> > >
>
> > > > > On Sep 23, 2020 at 4:15 PM +0800, Ruben Q L wrote:
>
> > >
>
> > > > > > Hello Danny,
>
> > >
>
> > > > > >
>
> > >
>
> > > > > > As the release manager for 1.26.0, I think what you say sounds
>
> > >
>
> > > > reasonable.
>
> > >
>
> > > > > > We had 1.25 around mid-August, I think we could definitely aim
>
> > > mid/end
>
> > >
>
> > > > > > October for 1.26. I would propose to have a RC by 10th-15th
>
> > October.
>
> > >
>
> > > > > > According to the Jira dashboard [1] there are still 6 unresolved
>
> > >
>
> > > > issues,
>
> > >
>
> > > > > > let us try to focus on that before adding new stuff into the next
>
> > >
>
> > > > version.
>
> > >
>
> > > > > >
>
> > >
>
> > > > > > Best regards,
>
> > >
>
> > > > > > Ruben
>
> > >
>
> > > > > >
>
> > >
>
> > > > > > [1]
>
> > >
>
> > > > > >
>
> > >
>
> > > >
>
> > >
>
> >
> https://issues.apache.org/jira/secure/Dashboard.jspa?selectPageId=12333950
>
> > >
>
> > > > > >
>
> > >
>
> > > > > >
>
> > >
>
> > > > > > On Wed, Sep 23, 2020 at 6:31 AM Danny Chan wrote:
>
> > >
>
> > > > > >
>
> > >
>
> > > > > > > Is there any possibility that 1.26.0 can be released in the
>
> > > following
>
> > >
>
> > > > > > > weeks, e.g. within the October ?
>
> > >
>
> > > > > > >
>
> > >
>
> > > > > > > Apache Flink would release next major version in the end of
>
> > > October,
>
> > >
>
> > > > if we
>
> > >
>
> > > > > > > can use the latest Calcite code, that would be awesome ~
>
> > >
>
> > > > > > >
>
> > >
>
> > > > > > > Best,
>
> > >
>
> > > > > > > Danny Chan
>
> > >
>
> > > > > > >
>
> > >
>
> > > >
>
> > >
>
> > >
>
> >
>
>

Re: [DISCUSS] The date to release Calcite 1.26.0

2020-09-25 Thread Danny Chan

Ruben, can we make sure that the next release happens before 8th October?
I would like to run local tests for it before then.

On Thu, Sep 24, 2020 at 2:32 AM Ruben Q L wrote:

> Allright, I will send a separate email to start the "official" discussion
>
> for 1.26.0.
>
>
>
> Best,
>
> Ruben
>
>
>
>
>
> On Wed, Sep 23, 2020 at 5:17 PM Julian Hyde wrote:
>
>
>
> > I agree, this is a good time for a release. Remember that releases
>
> > always take a week or two longer than we think.
>
> >
>
> > Danny, I need your help on
>
> > https://issues.apache.org/jira/browse/CALCITE-4259 (supporting JDK 15
>
> > and removing log4j), which is a must-fix for 1.26.
>
> >
>
> > Julian
>
> >
>
> >
>
> > On Wed, Sep 23, 2020 at 2:30 AM Danny Chan  wrote:
>
> > >
>
> > > Thanks so much, Ruben ~
>
> > >
>
> > > Can the RC date be one week in advance, say about 3th-8th October,
>
> > because I need some time for the upgrade of Flink side.
>
> > >
>
> > > Before the RC, I need amount of local tests for Flink and I would have
>
> > to make sure that the 1.26.0 is production ready for the nest Flink
> release.
>
> > >
>
> > > And it would be great if we can make this sooner, this is important
>
> > because I can fire the upgrade PR sooner and actually we have many
> features
>
> > developed based on current master branch.
>
> > >
>
> > > Best,
>
> > > Danny Chan
>
> > > On Sep 23, 2020 at 4:15 PM +0800, Ruben Q L wrote:
>
> > > > Hello Danny,
>
> > > >
>
> > > > As the release manager for 1.26.0, I think what you say sounds
>
> > reasonable.
>
> > > > We had 1.25 around mid-August, I think we could definitely aim
> mid/end
>
> > > > October for 1.26. I would propose to have a RC by 10th-15th October.
>
> > > > According to the Jira dashboard [1] there are still 6 unresolved
>
> > issues,
>
> > > > let us try to focus on that before adding new stuff into the next
>
> > version.
>
> > > >
>
> > > > Best regards,
>
> > > > Ruben
>
> > > >
>
> > > > [1]
>
> > > >
>
> >
> https://issues.apache.org/jira/secure/Dashboard.jspa?selectPageId=12333950
>
> > > >
>
> > > >
>
> > > > On Wed, Sep 23, 2020 at 6:31 AM Danny Chan wrote:
>
> > > >
>
> > > > > Is there any possibility that 1.26.0 can be released in the
> following
>
> > > > > weeks, e.g. within the October ?
>
> > > > >
>
> > > > > Apache Flink would release next major version in the end of
> October,
>
> > if we
>
> > > > > can use the latest Calcite code, that would be awesome ~
>
> > > > >
>
> > > > > Best,
>
> > > > > Danny Chan
>
> > > > >
>
> >
>
>

Re: [ DISCUSS] The window table functions join syntax

2020-09-23 Thread Danny Chan

Thanks for the feedback, I agree we should keep the verbose part

**L.window_start = R.window_start AND L.window_end = R.window_end**

which makes the semantics clearer ~
Danny Chan
On Sep 23, 2020 at 3:24 PM +0800, Viliam Durina wrote:
> You can also use
>
> SELECT L.f0, R.f2, L.window_start, L.window_end
> FROM
> Tumble(table T1, descriptor(T1.ts), INTERVAL '5' MINUTE) L
> JOIN
> Tumble(table T2, descriptor(T2.ts), INTERVAL '5' MINUTE) R
> USING (f0, window_start)
>
> Viliam
>
> On Wed, 23 Sep 2020 at 08:02, Rui Wang  wrote:
>
> > Regarding **L.window_start = R.window_start AND L.window_end =
> > R.window_end**:
> >
> > In general, the current table function windowing model is to append window
> > metadata to table directly, thus window metadata becomes a part of table
> > (or call it data). So as a part of table, these two columns should be
> > treated as normal columns thus they should be in the join on condition.
> >
> > If you want to make it optional, it makes window start/end columns special
> > and has a semantic binding with special table functions (TUMBLE, HOP,
> > SESSION), which then becomes really not a SQL thing. For example, we can
> > allow users to define their own windowing table function. In that case, how
> > will you utilize window start/end produced by a customized windowing table
> > function? What if users produce weird windows that have overlapping window
> > starts or window ends?
> >
> > Keeping windows start/end as a part of the table, treating them no
> > different from other columns, could give a consistent behavior for either
> > built-in table function or user-defined table function.
> >
> > If you think it is too verbose, there are two options to optimize:
> >
> > 1. for TUMBLE/HOP/SESSION, to identify a unique window, you will only need
> > either window start or end, so you can simplify it, for example, to
> > L.window_start = R.window_start only.
> > 2. (not recommended), you can cut off **L.window_start = R.window_start AND
> > L.window_end = R.window_end**, but add window metadata comparison to join
> > implicitly by execution engine. E.g. you can make up the join condition in
> > your JoinRel if two inputs are TUMBLE.
> >
> >
> >
> > -Rui
> >
> >
> >
> >
> > On Tue, Sep 22, 2020 at 10:27 PM Danny Chan  wrote:
> >
> > > Yes, the red part is **L.window_start = R.window_start AND L.window_end =
> > > R.window_end**
> > >
> > > > Is this a limitation for "triggered by the watermark of the stream"?
> > >
> > > No, because in most cases there is no need to output the
> > > intermediate/partial join records and then send retractions.
> > >
> > >
> > > So, what do you think about the condition syntax **L.window_start =
> > > R.window_start AND L.window_end = R.window_end**?
> > >
> > > Best,
> > > Danny Chan
> > > On Sep 23, 2020 at 12:47 PM +0800, dev@calcite.apache.org wrote:
> > > >
> > > > L.window_start = R.window_start AND L.window_end = R.window_end
> > >
> >
>
>
> --
> Viliam Durina
> Jet Developer
> hazelcast®
>
> <https://www.hazelcast.com> 2 W 5th Ave, Ste 300 | San Mateo, CA 94402 |
> USA
> +1 (650) 521-5453 | hazelcast.com <https://www.hazelcast.com>
>
> --
> This message contains confidential information and is intended only for the
> individuals named. If you are not the named addressee you should not
> disseminate, distribute or copy this e-mail. Please notify the sender
> immediately by e-mail if you have received this e-mail by mistake and
> delete this e-mail from your system. E-mail transmission cannot be
> guaranteed to be secure or error-free as information could be intercepted,
> corrupted, lost, destroyed, arrive late or incomplete, or contain viruses.
> The sender therefore does not accept liability for any errors or omissions
> in the contents of this message, which arise as a result of e-mail
> transmission. If verification is required, please request a hard-copy
> version. -Hazelcast

Re: [DISCUSS] The date to release Calcite 1.26.0

2020-09-23 Thread Danny Chan

Thanks so much, Ruben ~

Can the RC date be moved one week earlier, say around 3rd-8th October? I 
need some time for the upgrade on the Flink side.

Before the RC, I need to run a number of local tests for Flink, and I have to 
make sure that 1.26.0 is production ready for the next Flink release.

It would be great if we could make this sooner. This is important because I 
can then fire the upgrade PR sooner, and we actually have many features 
developed based on the current master branch.

Best,
Danny Chan
On Sep 23, 2020 at 4:15 PM +0800, Ruben Q L wrote:
> Hello Danny,
>
> As the release manager for 1.26.0, I think what you say sounds reasonable.
> We had 1.25 around mid-August, I think we could definitely aim mid/end
> October for 1.26. I would propose to have a RC by 10th-15th October.
> According to the Jira dashboard [1] there are still 6 unresolved issues,
> let us try to focus on that before adding new stuff into the next version.
>
> Best regards,
> Ruben
>
> [1]
> https://issues.apache.org/jira/secure/Dashboard.jspa?selectPageId=12333950
>
>
> On Wed, Sep 23, 2020 at 6:31 AM Danny Chan wrote:
>
> > Is there any possibility that 1.26.0 can be released in the following
> > weeks, e.g. within the October ?
> >
> > Apache Flink would release next major version in the end of October, if we
> > can use the latest Calcite code, that would be awesome ~
> >
> > Best,
> > Danny Chan
> >

[DISCUSS] The date to release Calcite 1.26.0

2020-09-22 Thread Danny Chan

Is there any possibility that 1.26.0 can be released in the coming weeks, 
e.g. within October?

Apache Flink will release its next major version at the end of October; if we 
can use the latest Calcite code, that would be awesome ~

Best,
Danny Chan

Re: [ DISCUSS] The window table functions join syntax

2020-09-22 Thread Danny Chan

Yes, the red part is **L.window_start = R.window_start AND L.window_end = 
R.window_end**

> Is this a limitation for "triggered by the watermark of the stream"?

No, because in most cases there is no need to output the 
intermediate/partial join records and then send retractions.


So, what do you think about the condition syntax **L.window_start = 
R.window_start AND L.window_end = R.window_end** ?

Best,
Danny Chan
On 23 Sep 2020 at 12:47 PM +0800, dev@calcite.apache.org wrote:
>
> L.window_start = R.window_start AND L.window_end = R.window_end

[ DISCUSS] The window table functions join syntax

2020-09-22 Thread Danny Chan

Hi, fellows, long time no see on the mailing list ~

Here I want to have a discussion on the join syntax of our recently introduced 
window table functions ~

For example, we can define a tumbling window function with a size of 5 minutes as:

Tumble(table T, descriptor(T.ts), INTERVAL '5' MINUTE)

Then we can select from it. Moreover, I would like to support joining two 
window functions in streaming queries soon.

The semantics of the windowed stream join is:

• The 2 window inputs should have the same window arguments (except for the 
table name), e.g. for TUMBLE the size should be equal; for HOP, both the slide 
interval and the size should be equal
• We first window the input streams, then join the two windowed data sets that 
belong to the same TimeWindow
• The join action is triggered by the watermark of the stream
• The join does not produce retractions of the stream, which is the main 
difference from a normal two-stream join
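
The semantics above can be sketched in plain Java. This is only an illustration under assumptions, not Calcite or Flink code: the names WindowJoinSketch, Row, windowStart and windowJoin are hypothetical, timestamps are plain milliseconds, and watermarks and triggering are left out. It shows only the "same key, same window" matching for two tumbling inputs of equal size.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch: joins two inputs on key and on identical tumbling window.
final class WindowJoinSketch {
  /** A row with a join key and an event timestamp in milliseconds. */
  static final class Row {
    final String key;
    final long ts;

    Row(String key, long ts) {
      this.key = key;
      this.ts = ts;
    }
  }

  /** Start of the tumbling window of the given size that contains ts. */
  static long windowStart(long ts, long sizeMs) {
    return ts - Math.floorMod(ts, sizeMs);
  }

  /** Emits one entry per pair of rows that share the key and the window. */
  static List<String> windowJoin(List<Row> left, List<Row> right, long sizeMs) {
    List<String> out = new ArrayList<>();
    for (Row l : left) {
      long start = windowStart(l.ts, sizeMs);
      for (Row r : right) {
        // Both sides use the same size, so equal starts imply equal ends.
        boolean sameWindow = start == windowStart(r.ts, sizeMs);
        if (sameWindow && l.key.equals(r.key)) {
          out.add(l.key + "@[" + start + "," + (start + sizeMs) + ")");
        }
      }
    }
    return out;
  }

  public static void main(String[] args) {
    long fiveMinutes = 5 * 60 * 1000L;
    List<Row> left = Arrays.asList(new Row("a", 10_000L), new Row("a", 400_000L));
    List<Row> right = Arrays.asList(new Row("a", 20_000L), new Row("b", 30_000L));
    System.out.println(windowJoin(left, right, fiveMinutes)); // prints [a@[0,300000)]
  }
}
```

Because both inputs are windowed with the same arguments, comparing the window start alone is enough in this sketch, which is one reason the explicit window_start/window_end equality in a join condition can look redundant.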

And I want to propose a join syntax as:

SELECT L.f0, R.f2, L.window_start, L.window_end
FROM
Tumble(table T1, descriptor(T1.ts), INTERVAL '5' MINUTE) L
JOIN
Tumble(table T2, descriptor(T2.ts), INTERVAL '5' MINUTE) R
ON
L.f0 = R.f0 AND L.window_start = R.window_start AND L.window_end = R.window_end

The red syntax part (L.window_start = R.window_start AND L.window_end = 
R.window_end) is what I want to discuss. The condition seems too verbose 
because users need to declare it every time.


• Should we make it optional ?
• Is there a better syntax to describe this window join semantics ?


Best,
Danny Chan

Re: [ANNOUNCE] New committer: Rui Wang

2020-09-09 Thread Danny Chan

Congrats Rui!

Best,
Danny Chan



> On 10 Sep 2020 at 6:28 AM, Francis Chuang wrote:
> 
> Congrats Rui!

Re: [ANNOUNCE] Ruben Quesada Lopez joins Calcite PMC

2020-08-11 Thread Danny Chan

Congratulations ~

Best,
Danny Chan
On 12 Aug 2020 at 9:57 AM +0800, chunwei wrote:
> Congrats, Ruben!
>
> On Wed, Aug 12, 2020 at 8:58 AM XING JIN  wrote:
>
> > Congrats, Ruben!
> >
> > On Wed, 12 Aug 2020 at 7:47 AM, 953396112 <953396...@qq.com> wrote:
> >
> > > Congratulations,Ruben!
> > >
> > >
> > > xzh
> > > -- Original message --
> > > From: "dev" <zabe...@gmail.com>
> > > Sent: Wednesday, 12 Aug 2020, 5:53 AM
> > > To: "dev"
> > > Subject: [ANNOUNCE] Ruben Quesada Lopez joins Calcite PMC
> > >
> > >
> > >
> > > I'm pleased to announce that Ruben has accepted an invitation to
> > > join the Calcite PMC. Ruben has been a consistent and helpful
> > > figure in the Calcite community for which we are very grateful. We
> > > look forward to the continued contributions and support.
> > >
> > > Please join me in congratulating Ruben!
> > >
> > > - Stamatis (on behalf of the Calcite PMC)
> >

Re: [DISCUSS] Publish test JARS

2020-08-11 Thread Danny Chan

Michael, do you plan to push this function/patch ? We actually need this too ~

Best,
Danny Chan
On 11 Aug 2020 at 3:06 AM +0800, Michael Mior wrote:
> Thanks for pointing back to the previous discussion. I'm fine with
> publishing separate modules. No real preference on my end since I
> haven't found consuming test artifacts to cause any problems. Although
> I'm not sure what the changes to the Gradle config would look like to
> support this.
>
> --
> Michael Mior
> mm...@apache.org
>
> On Mon, 10 Aug 2020 at 14:49, Vladimir Sitnikov wrote:
> >
> > Hi Michael,
> >
> > I suggest we go with adding explicitly published modules rather than
> > publishing tests.
> > Test artifacts do not have their own pom.xml, so they are not really
> > convenient for consumers.
> >
> > Here's the relevant thread:
> > https://lists.apache.org/thread.html/2ab9652ea855dce0b5b75cc221a5f74ebec536818847ae5ce6d284f5%40%3Cdev.calcite.apache.org%3E
> >
> > I guess the naming could be calcite-testkit or calcite-test-framework.
> >
> > Vladimir

Re: Pushing down aggregates through rhs of a left join

2020-08-03 Thread Danny Chan

I haven't seen the derivation formula yet, but I believe there is indeed room 
for improvement in the Aggregate-Join transpose cases. Alex, can you log an 
issue for it?

Best,
Danny Chan
On 31 Jul 2020 at 1:43 PM +0800, Alex Baden wrote:
> Hi all,
>
> I have a query of the form:
>
> SELECT a.x, SUM(b.y), SUM(b.z) FROM t1 a LEFT JOIN t2 b ON a.join_key
> = b.join_key GROUP BY a.x ORDER BY a.x;
>
> If table b has a large number of duplicate keys for `join_key`, the
> left join can be very expensive to compute. Instead, we would like to
> run:
>
> WITH t2g AS (SELECT b.join_key, SUM(b.y) AS y, SUM(b.z) AS z FROM t2 b
> GROUP BY b.join_key)
> SELECT a.x, SUM(bg.y), SUM(bg.z) FROM t1 a LEFT JOIN t2g bg ON
> a.join_key = bg.join_key GROUP BY a.x ORDER BY a.x;
>
> Essentially, since we are only projecting aggregates from the rhs of
> the join, and the aggregate functions are associative, we can group by
> the join key to compute the aggregates up front, then join on the
> grouped results, and finally aggregate among join matches.
>
> Looking at the comments of the AGGREGATE_JOIN_TRANSPOSE_RULE, I noted
> the following:
> // OUTER joins are supported for group by without aggregate functions
>
> But based on the above, if we have a left join I believe we can
> transpose the aggregate and the join if the following conditions hold:
> 1) only expressions from the rhs of the join are aggregated
> 2) all aggregate functions from (1) are associative (can be split)
> 3) at least one expression from the lhs of the join is grouped
>
> I am interested in implementing this rule (assuming the conditions
> above are strong enough to guarantee correctness). Is there interest
> in PRing something like this to calcite, either as a new rule or part
> of AggregateJoinTranspose?
>
> Thanks,
> Alex
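
The rewrite proposed in the quoted message can be checked on a small data set. The following is a minimal sketch in plain Java, not Calcite code: the class and method names are hypothetical, only SUM over a single non-null column y is modelled, and a null result stands for SQL NULL (a group whose left rows have no match). It evaluates the query both ways and compares the results.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical sketch: SUM pushed below a LEFT JOIN versus SUM above it.
final class AggJoinTransposeSketch {
  /** Combines two partial sums the way SQL SUM does: NULL inputs are skipped. */
  static Long add(Long a, Long b) {
    if (a == null) return b;
    if (b == null) return a;
    return a + b;
  }

  /** t1 rows are {x, joinKey}, t2 rows are {joinKey, y}. Join, then SUM(y) GROUP BY x. */
  static Map<Long, Long> joinThenAggregate(List<long[]> t1, List<long[]> t2) {
    Map<Long, Long> result = new TreeMap<>();
    for (long[] a : t1) {
      // A LEFT JOIN keeps the group even when no t2 row matches.
      if (!result.containsKey(a[0])) result.put(a[0], null);
      for (long[] b : t2) {
        if (a[1] == b[0]) result.put(a[0], add(result.get(a[0]), b[1]));
      }
    }
    return result;
  }

  /** Pre-aggregates t2 by join key, then joins and sums the partial sums. */
  static Map<Long, Long> aggregateThenJoin(List<long[]> t1, List<long[]> t2) {
    Map<Long, Long> t2g = new HashMap<>();
    for (long[] b : t2) t2g.merge(b[0], b[1], Long::sum);
    Map<Long, Long> result = new TreeMap<>();
    for (long[] a : t1) {
      Long partial = t2g.get(a[1]); // null when the left row has no match
      result.put(a[0], add(result.get(a[0]), partial));
    }
    return result;
  }

  public static void main(String[] args) {
    List<long[]> t1 = Arrays.asList(
        new long[] {1, 10}, new long[] {1, 20}, new long[] {2, 30}, new long[] {3, 10});
    List<long[]> t2 = Arrays.asList(
        new long[] {10, 5}, new long[] {10, 7}, new long[] {20, 1});
    System.out.println(joinThenAggregate(t1, t2)); // prints {1=13, 2=null, 3=12}
    System.out.println(aggregateThenJoin(t1, t2)); // prints {1=13, 2=null, 3=12}
  }
}
```

Other aggregate functions would need their own split (for example COUNT becomes a SUM of partial counts), so this sketch says nothing about whether the three listed conditions are sufficient in general.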

Re: [DISCUSSION] Rename master branch to main

2020-07-28 Thread Danny Chan

As a Chinese, I didn't quite understand why the word "master" can be 
associated with slavery. I usually read it with a meaning similar to "main". 
It will take some time to adapt to the new term "main", because I believe most 
developers are used to the word "master".

> I think this is a relatively low impact change that can potentially
> make us even more welcoming to new contributors, which is a benefit to
> us all :)

Is this true? People will contribute to Calcite if they need to, presumably 
not just because of a branch name.

Best,
Danny Chan
On 29 Jul 2020 at 7:08 AM +0800, Michael Mior wrote:
> Actually, the argument that the term "master" in git didn't originate
> from master/slave is not true. See the article I linked earlier. In
> any case, I don't think the change hurts anyone other than a brief
> annoyance when we all have to change our branch name and if it makes
> the project more welcoming to someone, then great.
>
> --
> Michael Mior
> mm...@apache.org
>
>
> On Tue, 28 Jul 2020 at 17:29, Julian Hyde wrote:
> >
> > I agree with you. It’s probably derived from “master” as in the “gold 
> > master” [1] which is the mix from which a sound engineer would cut a record 
> > or CD. And who knows where that term came from?
> >
> > But in the end, the origin of the term is irrelevant. The current name is, 
> > or may be, unwelcoming to some people, so let’s just move on.
> >
> > Julian
> >
> > [1] https://en.wikipedia.org/wiki/Mastering_(audio)
> >
> > > On Jul 28, 2020, at 1:56 PM, Viliam Durina  wrote:
> > >
> > > It's not a term related to slavery, it has much broader meaning than 
> > > "slave
> > > owner", but any argument is probably vain.
> > >
> > > On Tue, 28 Jul 2020 at 19:43, Julian Hyde  wrote:
> > >
> > > > I am in favor of renaming ‘master’ to ‘main’. To most people it doesn’t
> > > > make any difference. To some, such as potential members currently 
> > > > outside
> > > > the community, it makes the project more welcoming.
> > > >
> > > > Very little effort or disruption is required. We’ve identified a 
> > > > potential
> > > > source of friction, so let’s fix it and move on.
> > > >
> > > > Julian
> > > >
> > > > > On Jul 28, 2020, at 10:31 AM, Michael Mior  wrote:
> > > > >
> > > > > Hi all,
> > > > >
> > > > > You can find some background on this discussion at the link below [0].
> > > > > This is a topic that has come up regularly among D&I folks at the ASF.
> > > > > The short summary is that the term "master" when referring to a git
> > > > > branch is a reference to terminology related to slavery. I'm
> > > > > suggesting main because this seems to be what the developer community
> > > > > as a whole is gravitating towards. See for example, GitHub's public
> > > > > roadmap [1] where there are plans to make this change.
> > > > >
> > > > > I'm hoping that this discussion can be focused not on whether anyone
> > > > > has been impacted by such terminology, but how we can move forward. I
> > > > > personally believe that if a single person feels more welcome to
> > > > > contribute because of the change, it's a win. I also don't think
> > > > > making this change needs to be painful. (There are less than 20
> > > > > relevant references to "master" in the Calcite code.) Apache Mahout
> > > > > and I believe others have already made this change.
> > > > >
> > > > > I think this is a relatively low impact change that can potentially
> > > > > make us even more welcoming to new contributors, which is a benefit to
> > > > > us all :)
> > > > >
> > > > > [0]
> > > > http://www.kapwing.com/blog/how-to-rename-your-master-branch-to-main-in-git/
> > > > > [1] https://github.com/github/roadmap/issues/63
> > > > >
> > > > > --
> > > > > Michael Mior
> > > > > mm...@apache.org
> > > >
> > > >
> > >
> > > --
> > > Viliam Durina
> > > Jet Developer
> > > hazelcast®
> > >
> > > <https://www.hazelcast.com> 2 W 5th Ave, Ste 300 | San Mateo, CA 94402 |
> > > USA
> > > +1 (650) 521-5453 | hazelcast.com <https://www.hazelcast.com>
> > >
> >

Re: Exception:Catch clause is unreachable

2020-07-24 Thread Danny Chan

Could you add a reproducible test case to JdbcTest and log an issue? That way 
volunteers can contribute a fix if they want to.

Best,
Danny Chan
On 24 Jul 2020 at 2:05 PM +0800, tonytao wrote:
> I found that the empty result arises during SELECT validation: it could
> not get the column data types from the JDBC connection metadata.
>
> I changed the query, and it now throws an exception saying "Column 'TS' not
> found in any table":
>         String url="jdbc:TAOS://127.0.0.1:6030/hdb";
> DriverManager.registerDriver(DriverManager.getDriver(url));
>         Class.forName("org.postgresql.Driver");
>         BasicDataSource dataSource = new BasicDataSource();
>         ...
>         Schema schema = JdbcSchema.create(rootSchema, "test", dataSource,
>                 "hdb", null);
>         rootSchema.add("test", schema);
>         Statement stat = calciteConnection.createStatement();
>         ResultSet rs = stat.executeQuery("select ts,speed from test.t");
>         while(rs.next()) {
>             System.out.println(rs.getObject(2));
>         }
>
> I debugged this code step by step and found that this happens at
> JdbcSchema.java line 368, in the function
>     getRelDataType(DatabaseMetaData metaData, String catalogName,
>         String schemaName, String tableName)
> Here catalogName and schemaName are both null, and tableName is "t".
> It then executes
>     final ResultSet resultSet =
>         metaData.getColumns(catalogName, schemaName, tableName, null);
> to get the columns.
>
> *But I don't know why catalogName is null; I think it should be "hdb".*
>
> I checked the JdbcTable object: its jdbcCatalogName property was also
> null, while jdbcSchema.catalog was "hdb".
>
> Is this a bug? The catalog assigned when the JdbcSchema was created was
> not passed on to the JdbcTable.
>
> this is the stack trace:
> Thread [main] (Suspended)
>     owns: CalciteJdbc41Factory$CalciteJdbc41Statement  (id=40)
>     JdbcSchema.getRelDataType(DatabaseMetaData, String, String, String)
> line: 370
>     JdbcSchema.getRelDataType(String, String, String) line: 362
>     JdbcTable.getRowType(RelDataTypeFactory) line: 117
>     EmptyScope.resolve_(CalciteSchema, List, List,
> SqlNameMatcher, Path, Resolved) line: 159
>     EmptyScope.resolveTable(List, SqlNameMatcher, Path,
> Resolved) line: 99
>     CatalogScope(DelegatingScope).resolveTable(List,
> SqlNameMatcher, Path, Resolved) line: 203
>     IdentifierNamespace.resolveImpl(SqlIdentifier) line: 112
>     IdentifierNamespace.validateImpl(RelDataType) line: 184
>     IdentifierNamespace(AbstractNamespace).validate(RelDataType) line: 84
> CalciteSqlValidator(SqlValidatorImpl).validateNamespace(SqlValidatorNamespace,
> RelDataType) line: 1110
>     CalciteSqlValidator(SqlValidatorImpl).validateQuery(SqlNode,
> SqlValidatorScope, RelDataType) line: 1084
>     CalciteSqlValidator(SqlValidatorImpl).validateFrom(SqlNode,
> RelDataType, SqlValidatorScope) line: 3256
>     CalciteSqlValidator(SqlValidatorImpl).validateFrom(SqlNode,
> RelDataType, SqlValidatorScope) line: 3238
> CalciteSqlValidator(SqlValidatorImpl).validateSelect(SqlSelect,
> RelDataType) line: 3510
>     SelectNamespace.validateImpl(RelDataType) line: 60
>     SelectNamespace(AbstractNamespace).validate(RelDataType) line: 84
> CalciteSqlValidator(SqlValidatorImpl).validateNamespace(SqlValidatorNamespace,
> RelDataType) line: 1110
>     CalciteSqlValidator(SqlValidatorImpl).validateQuery(SqlNode,
> SqlValidatorScope, RelDataType) line: 1084
>     SqlSelect.validate(SqlValidator, SqlValidatorScope) line: 232
> CalciteSqlValidator(SqlValidatorImpl).validateScopedExpression(SqlNode,
> SqlValidatorScope) line: 1059
>     CalciteSqlValidator(SqlValidatorImpl).validate(SqlNode) line: 766
>     SqlToRelConverter.convertQuery(SqlNode, boolean, boolean) line: 563
> CalcitePrepareImpl$CalcitePreparingStmt(Prepare).prepareSql(SqlNode,
> SqlNode, Class, SqlValidator, boolean) line: 242
> CalcitePrepareImpl$CalcitePreparingStmt(Prepare).prepareSql(SqlNode,
> Class, SqlValidator, boolean) line: 208
>     CalcitePrepareImpl.prepare2_(Context, Query, Type, long,
> CalciteCatalogReader, RelOptPlanner) line: 632
>     CalcitePrepareImpl.prepare_(Context, Query, Type, long) line: 498
>     CalcitePrepareImpl.prepareSql(Context, Query, Type, long) line: 468
> CalciteJdbc41Factory$CalciteJdbc41Connection(CalciteConnectionImpl).parseQuery(Query,
> Context, long) line: 231
>     CalciteMetaImpl.prepareAndExecute(Meta$StatementHandle, String,
> long, int, Meta$PrepareCallback) line: 552
> CalciteJdbc41Factory$CalciteJdbc41Connection(AvaticaConnection).prepareAndExecuteInternal(AvaticaStatement,

Re: Re: Exception:Catch clause is unreachable

2020-07-23 Thread Danny Chan

In general, Calcite uses Janino to compile the generated Java source strings 
into real instances during query execution. Your error is thrown because the 
code generator produced a Java file that does not compile.

This is not too bad to debug: set a breakpoint in the debugger, and all the 
generated code will be written to the directory you have configured. Here, you 
can see the path of the failing file:

> File '/home/tony/workspace/tmp/janino7222401528383593043.java', Line 8, 
> Column 20:

The generated file names may change each time you debug, and the files 
disappear when you stop debugging, so you need to copy the files out and paste 
them into IDEA to see which file has the Java syntax error.
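
As a rough illustration of "the directory you have configured": the Calcite howto page linked earlier in this thread describes two Janino system properties for source debugging. The sketch below sets them programmatically; the class and method names are hypothetical, and it assumes the properties are set before any Janino class is loaded (passing them as -D JVM options is the safer route, since some Janino versions may read them only once).

```java
// Hypothetical helper: turns on Janino source debugging via system properties.
final class JaninoDebugConfig {
  /** Asks Janino to keep generated sources in the given directory. */
  static void enableSourceDebugging(String dir) {
    System.setProperty("org.codehaus.janino.source_debugging.enable", "true");
    System.setProperty("org.codehaus.janino.source_debugging.dir", dir);
  }

  public static void main(String[] args) {
    // Assumption: call this before opening the Calcite connection.
    enableSourceDebugging(System.getProperty("java.io.tmpdir"));
    System.out.println(System.getProperty("org.codehaus.janino.source_debugging.dir"));
  }
}
```

The equivalent command-line form is -Dorg.codehaus.janino.source_debugging.enable=true together with -Dorg.codehaus.janino.source_debugging.dir set to an existing directory.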

Best,
Danny Chan
On 23 Jul 2020 at 4:05 PM +0800, tonytao wrote:
> hi Danny,
>
> Thank you for your kind help.
>
> Attached is the generated code, but I have no idea how this file was 
> generated or how to fix the bug.
> I'm sorry that I could not receive your reply with my work email account, so I 
> replied with my personal account.
> errlog:
>at org.apache.calcite.avatica.Helper.wrap(Helper.java:37)
>at 
> org.apache.calcite.adapter.enumerable.EnumerableInterpretable.toBindable(EnumerableInterpretable.java:128)
>at 
> org.apache.calcite.prepare.CalcitePrepareImpl$CalcitePreparingStmt.implement(CalcitePrepareImpl.java:)
>at org.apache.calcite.prepare.Prepare.prepareSql(Prepare.java:309)
>at org.apache.calcite.prepare.Prepare.prepareSql(Prepare.java:208)
>at 
> org.apache.calcite.prepare.CalcitePrepareImpl.prepare2_(CalcitePrepareImpl.java:632)
>at 
> org.apache.calcite.prepare.CalcitePrepareImpl.prepare_(CalcitePrepareImpl.java:498)
>at 
> org.apache.calcite.prepare.CalcitePrepareImpl.prepareSql(CalcitePrepareImpl.java:468)
>at 
> org.apache.calcite.jdbc.CalciteConnectionImpl.parseQuery(CalciteConnectionImpl.java:231)
>at 
> org.apache.calcite.jdbc.CalciteMetaImpl.prepareAndExecute(CalciteMetaImpl.java:552)
>at 
> org.apache.calcite.avatica.AvaticaConnection.prepareAndExecuteInternal(AvaticaConnection.java:675)
>at 
> org.apache.calcite.avatica.AvaticaStatement.executeInternal(AvaticaStatement.java:156)
>... 2 more
> Caused by: org.codehaus.commons.compiler.CompileException: File 
> '/home/tony/workspace/tmp/janino7222401528383593043.java', Line 8, Column 20: 
> Catch clause is unreachable
>at 
> org.codehaus.janino.UnitCompiler.compileError(UnitCompiler.java:12211)
>at 
> org.codehaus.janino.UnitCompiler.compileTryCatch(UnitCompiler.java:3110)
>at 
> org.codehaus.janino.UnitCompiler.compileTryCatchFinally(UnitCompiler.java:2966)
>at 
> org.codehaus.janino.UnitCompiler.compileTryCatchFinallyWithResources(UnitCompiler.java:2770)
>
>
> Thanks!
>
> It seems there is a problem with the generated code. Follow the guide in [1] 
> to debug it and see which line causes the compilation error.
>
> [1] 
> https://calcite.apache.org/docs/howto.html#debugging-generated-classes-in-intellij
>
> Best,
> Danny Chan
> On 22 Jul 2020 at 12:17 AM +0800, taojin wrote:
> > hi folks,
> >
> > I have hit a compile error on calcite-core:1.23.0; the JDK version is
> > openjdk version "11.0.7" 2020-04-14.
> >
> > Here is the error log. Is this a bug?
> >
> > Exception in thread "main" java.sql.SQLException: Error while executing
> > SQL "select * from test.t": Error while compiling generated Java code:
> > public org.apache.calcite.linq4j.Enumerable bind(final
> > org.apache.calcite.DataContext root) {
> >   final org.apache.calcite.linq4j.function.Function1 rowBuilderFactory
> > = new org.apache.calcite.linq4j.function.Function1() {
> > public org.apache.calcite.linq4j.function.Function0 apply(final
> > java.sql.ResultSet resultSet) {
> >   return new org.apache.calcite.linq4j.function.Function0() {
> >   public Object apply() {
> > try {
> >   return new Object[0];
> > } catch (java.sql.SQLException e) {
> >   throw new RuntimeException(
> > e);
> > }
> >   }
> > }
> >   ;
> > }
> > public Object apply(final Object resultSet) {
> >   return apply(
> > (java.sql.ResultSet) resultSet);
> > }
> >   }
> >   ;
> >   final org.apache.calcite.runtime.ResultSetEnumerable enumerable =
> > org.apache.calcite.runtime.ResultSetEnumerable.of((javax.sql.DataSource)
> > root.getRootSchema().getSubSchema("test").unw

Re: [VOTE] Release apache-calcite-1.24.0 (release candidate 0)

2020-07-22 Thread Danny Chan

Thanks Chunwei for taking care of the release ~

+1 (binding)
- verified hashes and checksums
- built from sources and run tests (1.8.0_161 on MacOS Mojave)
- run some tests of Flink

Best,
Danny Chan
On 22 Jul 2020 at 11:50 PM +0800, dev@calcite.apache.org wrote:
>
> +1 (non binding)
> - verified hashes and checksums
> - built from sources and run tests (JDK14 on Linux)
> - run tests of HerdDB and some client application

Re: Exception:Catch clause is unreachable

2020-07-21 Thread Danny Chan

It seems there is a problem with the generated code. Follow the guide in [1] 
to debug it and see which line causes the compilation error.

[1] 
https://calcite.apache.org/docs/howto.html#debugging-generated-classes-in-intellij

Best,
Danny Chan
On 22 Jul 2020 at 12:17 AM +0800, taojin wrote:
> hi folks,
>
> I have hit a compile error on calcite-core:1.23.0; the JDK version is
> openjdk version "11.0.7" 2020-04-14.
>
> Here is the error log. Is this a bug?
>
> Exception in thread "main" java.sql.SQLException: Error while executing
> SQL "select * from test.t": Error while compiling generated Java code:
> public org.apache.calcite.linq4j.Enumerable bind(final
> org.apache.calcite.DataContext root) {
>   final org.apache.calcite.linq4j.function.Function1 rowBuilderFactory
> = new org.apache.calcite.linq4j.function.Function1() {
>     public org.apache.calcite.linq4j.function.Function0 apply(final
> java.sql.ResultSet resultSet) {
>   return new org.apache.calcite.linq4j.function.Function0() {
>   public Object apply() {
>     try {
>   return new Object[0];
>     } catch (java.sql.SQLException e) {
>   throw new RuntimeException(
>     e);
>     }
>   }
>     }
>   ;
>     }
>     public Object apply(final Object resultSet) {
>   return apply(
>     (java.sql.ResultSet) resultSet);
>     }
>   }
>   ;
>   final org.apache.calcite.runtime.ResultSetEnumerable enumerable =
> org.apache.calcite.runtime.ResultSetEnumerable.of((javax.sql.DataSource)
> root.getRootSchema().getSubSchema("test").unwrap(javax.sql.DataSource.class),
> "SELECT *\nFROM hdb.t", rowBuilderFactory);
>   enumerable.setTimeout(root);
>   return enumerable;
> }
>
>
> public Class getElementType() {
>   return org.apache.calcite.runtime.FlatLists.ComparableList.class;
> }
>
>
>
>     at org.apache.calcite.avatica.Helper.createException(Helper.java:56)
>     at org.apache.calcite.avatica.Helper.createException(Helper.java:41)
>     at
> org.apache.calcite.avatica.AvaticaStatement.executeInternal(AvaticaStatement.java:163)
>     at
> org.apache.calcite.avatica.AvaticaStatement.executeQuery(AvaticaStatement.java:227)
>     at mtest.App5.main(App5.java:45)
> Caused by: java.lang.RuntimeException: Error while compiling generated
> Java code:
> public org.apache.calcite.linq4j.Enumerable bind(final
> org.apache.calcite.DataContext root) {
>   final org.apache.calcite.linq4j.function.Function1 rowBuilderFactory
> = new org.apache.calcite.linq4j.function.Function1() {
>     public org.apache.calcite.linq4j.function.Function0 apply(final
> java.sql.ResultSet resultSet) {
>   return new org.apache.calcite.linq4j.function.Function0() {
>   public Object apply() {
>     try {
>   return new Object[0];
>     } catch (java.sql.SQLException e) {
>   throw new RuntimeException(
>     e);
>     }
>   }
>     }
>   ;
>     }
>     public Object apply(final Object resultSet) {
>   return apply(
>     (java.sql.ResultSet) resultSet);
>     }
>   }
>   ;
>   final org.apache.calcite.runtime.ResultSetEnumerable enumerable =
> org.apache.calcite.runtime.ResultSetEnumerable.of((javax.sql.DataSource)
> root.getRootSchema().getSubSchema("test").unwrap(javax.sql.DataSource.class),
> "SELECT *\nFROM hdb.t", rowBuilderFactory);
>   enumerable.setTimeout(root);
>   return enumerable;
> }
>
>
> public Class getElementType() {
>   return org.apache.calcite.runtime.FlatLists.ComparableList.class;
> }
>
>
>
>     at org.apache.calcite.avatica.Helper.wrap(Helper.java:37)
>     at
> org.apache.calcite.adapter.enumerable.EnumerableInterpretable.toBindable(EnumerableInterpretable.java:128)
>     at
> org.apache.calcite.prepare.CalcitePrepareImpl$CalcitePreparingStmt.implement(CalcitePrepareImpl.java:)
>     at org.apache.calcite.prepare.Prepare.prepareSql(Prepare.java:309)
>     at org.apache.calcite.prepare.Prepare.prepareSql(Prepare.java:208)
>     at
> org.apache.calcite.prepare.CalcitePrepareImpl.prepare2_(CalcitePrepareImpl.java:632)
>     at
> org.apache.calcite.prepare.CalcitePrepareImpl.prepare_(CalcitePrepareImpl.java:498)
>     at
> org.apache.calcite.prepare.CalcitePrepareImpl.prepareSql(CalcitePrepareImpl.java:468)
>     at
> org.apache.calcite.jdbc.CalciteConnectionImpl.parseQuery(CalciteConnectionImpl.java:231)
>     at
> org.apache.calcite.jdbc.CalciteMetaImpl.prepareAndExecute(CalciteMetaImpl.java:552)
>     at
> org.apache.calcite.avatica.AvaticaConnection.prepa

Re: custom metadata logic

2020-07-21 Thread Danny Chan

The suggested way to set up the RelMetadataProvider is through 
RelOptCluster#setMetadataProvider [1]. If you want to customize the 
RelMetadataQuery, see the Javadoc of RelMetadataQueryBase [2] for reference.

[1] 
https://github.com/apache/calcite/blob/master/core/src/main/java/org/apache/calcite/plan/RelOptCluster.java#L142
[2] 
https://github.com/apache/calcite/blob/d2c859098acb3210ac5cad90c5ea28a3317f0386/core/src/main/java/org/apache/calcite/rel/metadata/RelMetadataQueryBase.java#L58

Best,
Danny Chan
On 21 Jul 2020 at 4:18 PM +0800, Tal Glanzman wrote:
> thank you it is working great!
>
> On Tue, Jul 21, 2020 at 12:45 AM Stamatis Zampetakis 
> wrote:
>
> > Hi Taz,
> >
> > If you are relying on the RelMetadataQuery [1] API then you may need to set
> > your provider into THREAD_PROVIDERS in a similar way that it is done in
> > RelMetadataTest [2].
> >
> > Best,
> > Stamatis
> >
> > [1]
> >
> > https://github.com/apache/calcite/blob/7a462f2b2f78aa12068b691c1e423ea4c8a825e4/core/src/main/java/org/apache/calcite/rel/metadata/RelMetadataQuery.java#L75
> > [2]
> >
> > https://github.com/apache/calcite/blob/7a462f2b2f78aa12068b691c1e423ea4c8a825e4/core/src/test/java/org/apache/calcite/test/RelMetadataTest.java#L893
> >
> > On Mon, Jul 20, 2020 at 6:15 PM Tal Glanzman 
> > wrote:
> >
> > > hi,
> > >
> > > I am trying to figure out how to add custom logic for providing metadata
> > to
> > > nodes inside my adapter with no luck.
> > >
> > > I have an implementation of my own RelMetadataProvider, as described in
> > the
> > > docs.
> > >
> > > After looking around, i still couldn't find a way to use this provider in
> > > the planning phase
> > > - no property in the jdbc connection string
> > > - tried to invoke setMetadataProvider(provider) on the cluster of the
> > > initial nodes of the tree
> > >
> > > drilling a bit, i notice that the provider in use is
> > > JaninoRelMetadataProvider with provider=DefaultRelMetadataProvider.
> > >
> > > Is there a way to do such a thing using the standard execution?
> > >
> > > any guidance will be much appreciated, thanks!
> > >
> >

Re: Re: [DISCUSS] New Join Type: ANTI_NOTIN

2020-07-20 Thread Danny Chan

If it is only a constant NOT IN predicate, how difficult is it to rewrite it 
into a normal composite AND predicate before entering the planning phase?
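
To make the suggested rewrite concrete: x NOT IN (v1, ..., vn) over constants can be expanded to x <> v1 AND ... AND x <> vn. The following is a minimal sketch in plain Java rather than Calcite code (all names are hypothetical); it models SQL three-valued logic with a nullable Boolean, where null stands for UNKNOWN, so the two forms can be compared even when NULLs are involved.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Objects;

// Hypothetical sketch: NOT IN over constants versus a conjunction of <>.
final class NotInRewriteSketch {
  /** SQL equality under three-valued logic; null stands for UNKNOWN. */
  static Boolean eq(Integer a, Integer b) {
    return (a == null || b == null) ? null : Boolean.valueOf(a.equals(b));
  }

  static Boolean not(Boolean a) {
    return a == null ? null : Boolean.valueOf(!a);
  }

  static Boolean and(Boolean a, Boolean b) {
    if (Boolean.FALSE.equals(a) || Boolean.FALSE.equals(b)) return Boolean.FALSE;
    if (a == null || b == null) return null;
    return Boolean.TRUE;
  }

  static Boolean or(Boolean a, Boolean b) {
    return not(and(not(a), not(b)));
  }

  /** x NOT IN (v1, ..., vn), defined as NOT (x = v1 OR ... OR x = vn). */
  static Boolean notIn(Integer x, List<Integer> values) {
    Boolean in = Boolean.FALSE;
    for (Integer v : values) in = or(in, eq(x, v));
    return not(in);
  }

  /** The rewritten form: x <> v1 AND ... AND x <> vn. */
  static Boolean andOfNotEquals(Integer x, List<Integer> values) {
    Boolean result = Boolean.TRUE;
    for (Integer v : values) result = and(result, not(eq(x, v)));
    return result;
  }

  public static void main(String[] args) {
    List<Integer> values = Arrays.asList(1, 2, null);
    for (Integer x : Arrays.asList(1, 3, null)) {
      Boolean a = notIn(x, values);
      Boolean b = andOfNotEquals(x, values);
      System.out.println(x + ": " + a + " / " + b + " same=" + Objects.equals(a, b));
    }
  }
}
```

The equivalence is just De Morgan's law, which also holds in three-valued logic; the sketch covers constant lists only and does not address NOT IN over a sub-query.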

Best,
Danny Chan
On 21 Jul 2020 at 12:35 PM +0800, Haisheng Yuan wrote:
> Thanks Jinpeng for providing a good example for not in subquery.
>
> I 100% agree with you that correlated query won't be represented by 
> ANTI_NOTIN join type, and it is not the proposal's intention. Here what we 
> are discussing is not to use ANTI_NOTIN to represent all the NOT IN 
> sub-queries, that is impossible. Instead, if you take a close look at the 
> example query, it is a simple uncorrelated NOT IN sub-query. That is the 
> target. Let's focus on that kind of query, ask ourselves this question: Can 
> such a simple query be transformed into a ANTI join to make the plan 
> efficient?
>
> Sadly no. The reality is that this kind of query is not uncommon, may be much 
> more common than correlated NOT IN sub-queries.
>
>
> Reply to Julian:
> > > How about making a sub-query type (in RexSubQuery), so it is gone
> > > before we reach algebra.
> It will be nice to have a NOT_IN subquery type, without expanding NOT IN to 
> NOT(IN).
> However, if there is no ANTI_NOTIN in the join type (without reaching 
> algebra), does that mean the optimizer still can't generate efficient plan 
> for simple NOT IN sub-queries?
>
> > > ANTI_NOTIN is a terrible name. ANTI means 'opposite' so ANTI_NOTIN is
> > > the opposite of NOT IN?!
> It depends how people interpret ANTI. You interpret it as "opposite", I 
> interpret it as "ANTI JOIN", means "anti join for NOT IN, instead of NOT 
> EXISTS". But it is just a naming issue, I am OK to change it whatever name 
> that makes sense to the community, as long as it can convey the meaning.
>
> Thanks,
> Haisheng
>
> On 2020/07/21 03:02:20, Jinpeng Wu  wrote:
> > Hi.
> >
> > In some SQL engine, the query
> > select * from A where c1 not in ( select c1 from B where B.c2 = A.c2);
> > is transformed to a plan like
> > select * from A LEFT ANTI JOIN B on A.c2 = B.c2 AND (A.c1 = B.c1 OR A.c1 is
> > null OR B.c1 is null);
> >
> > Here, the "LEFT ANTI JOIN" is nothing more than traditional definition. One
> > thing seems to be a problem is that A.c1 cannot be used as a join key in
> > the new plan. However, the problem is also there for ANTI_NOTIN, and even
> > other NOT-IN-SUBQUERY physical implementations.
> >
> > Thanks,
> > Qiupeng
> >
> > On Tue, Jul 21, 2020 at 5:30 AM Julian Hyde  wrote:
> >
> > > How about making a sub-query type (in RexSubQuery), so it is gone
> > > before we reach algebra.
> > >
> > > ANTI_NOTIN is a terrible name. ANTI means 'opposite' so ANTI_NOTIN is
> > > the opposite of NOT IN?!
> > >
> > > On Mon, Jul 20, 2020 at 1:00 PM Haisheng Yuan  wrote:
> > > >
> > > > Typo:
> > > > We can just add a security guard saying that it is supported.
> > > > Should be
> > > > We can just add a security guard saying that it is NOT supported.
> > > >
> > > > On 2020/07/20 19:57:34, Haisheng Yuan  wrote:
> > > > > I am not sure I got your implication by "pollute". If you mean
> > > changes, yes, it requires some changes in rules. Do we need to change
> > > enumerables? Not necessary. We can just add a security guard saying that 
> > > it
> > > is supported. Not everyone requires the Enumerable operators to support
> > > everything. More importantly, currently there is no logic or rules to
> > > translate sub-query directly to SEMI/ANTI joins, let alone translating
> > > directly to ANTI_NOTIN. Currently NOT IN is expanded to NOT(IN ...) before
> > > entering RelNode land. That means we don't even have the chance to 
> > > generate
> > > the NOT IN anti join. Is that still a concern?
> > > > >
> > > > > Even if some day, some contributor extends Calcite's parser and
> > > SubqueryRemovalRule to be able to transform NOT_IN subquery into NOT IN
> > > anti join, we still have chance to disable it. Is that still a concern?
> > > > >
> > > > > There are many ways to play it safe.
> > > > >
> > > > > > Brainstorming: maybe we could consider it as a separate logical
> > > operator
> > > > > > (with its corresponding enumerable implementation)?
> > > > > It doesn't sound cool. It requires much more work. You have to
> > > duplicate all the rules, metadata handler

Re: [DISCUSS] Default disable the RexNode normalization(or operands reorder)

2020-07-16 Thread Danny Chan

I added one and it passed.

Best,
Danny Chan
On 16 Jul 2020 at 4:35 PM +0800, Vladimir Sitnikov wrote:
> Danny, could you please add a test case for nested normalization (see
> comment in pr) ?
>
> Vladimir

Re: [DISCUSS] Default disable the RexNode normalization(or operands reorder)

2020-07-16 Thread Danny Chan

I will commit the PR soon because I want it in the 1.24 release; we can 
still continue the discussion ~

Best,
Danny Chan
On Jul 16, 2020, 2:53 PM +0800, Vladimir Sitnikov wrote:
> > But currently AND/OR in Calcite is always binary operator
>
> I guess we might want to add multi-arg AND in the future to address
> AND(AND(AND(...))) issues.
> I know IntelliJ IDEA switched to multi-arg "+" and similar representations
> to reduce the complexity of certain operations.
>
> PS. Even though we can have a nice unordered hash function, we still need
> to implement `equals`, and it looks like we need to have normalization
> there.
> Do you suggest we should implement `equals` in such a way that it does not
> need to normalize the tree?
> Is there an approach?
>
> I see currently the implementation is normalizing both expressions every
> time `equals` is called.
> On top of that, it's not clear how it should behave with deep normalization.
>
> Vladimir

Re: [DISCUSS] Default disable the RexNode normalization(or operands reorder)

2020-07-15 Thread Danny Chan

Well, I think I get your idea now. I agree that a specific operator sub-class plus a 
special symmetric hash code is more efficient and extensible. But for the first 
version, I would only consider binary operators, because I could not figure out 
how to implement an efficient equals for operators like IN.
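
For illustration only, here is a minimal Java sketch of what a "symmetric hash code" plus an order-insensitive equals could look like for a binary call. The class BinaryCall and its fields are hypothetical stand-ins and are not Calcite classes; the orderSensitive flag only mirrors the inputOrderSensitive() idea discussed in this thread. It assumes non-null operands.

```java
/** Hypothetical stand-in for a binary call such as =($0, $1); not a Calcite class. */
final class BinaryCall {
  final String op;               // operator name, e.g. "=", "<>", "-"
  final Object left;             // left operand (assumed non-null)
  final Object right;            // right operand (assumed non-null)
  final boolean orderSensitive;  // false for symmetric operators such as "="

  BinaryCall(String op, Object left, Object right, boolean orderSensitive) {
    this.op = op;
    this.left = left;
    this.right = right;
    this.orderSensitive = orderSensitive;
  }

  @Override public boolean equals(Object o) {
    if (this == o) {
      return true;
    }
    if (!(o instanceof BinaryCall)) {
      return false;
    }
    BinaryCall that = (BinaryCall) o;
    if (!op.equals(that.op) || orderSensitive != that.orderSensitive) {
      return false;
    }
    boolean sameOrder = left.equals(that.left) && right.equals(that.right);
    if (orderSensitive) {
      return sameOrder;
    }
    // Symmetric operator: also accept the swapped operand order,
    // so no operand reordering (normalization) is needed.
    return sameOrder || (left.equals(that.right) && right.equals(that.left));
  }

  @Override public int hashCode() {
    int l = left.hashCode();
    int r = right.hashCode();
    // Addition is commutative, so swapping the operands of a symmetric
    // operator leaves the hash unchanged.
    int operands = orderSensitive ? 31 * l + r : l + r;
    return 31 * op.hashCode() + operands;
  }
}
```

With two operands the swapped comparison costs at most two extra equals calls. For an operator with many operands, such as IN, a pairwise check like this no longer scales, which is the difficulty mentioned above.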

Best,
Danny Chan
On Jul 16, 2020, 1:55 AM +0800, Haisheng Yuan wrote:
> > Customized SQL operators can also benefit. [1]
>
> I am not sure if I missed something. Can you show me how a customized 
> SQL operator can benefit from this?
> e.g. the geospatial intersect operator (it is input order insensitive):
> boolean &&( geometry A , geometry B )
>
> > Adding a SqlOperator interface is not that feasible because most of the 
> > operators share the same class
>
> Really? How hard is it to create a SqlOrderInsensitiveBinaryOperator that 
> just overrides the inputOrderSensitive() method, like 
> SqlMonotonicBinaryOperator does?
>
> > An eagerly computed hashcode for operands should not be a problem because 
> > sooner or later the hashcode would be computed in the digest cache of the 
> > planner, and the RexCall already has a hashcode cache.
>
> The question is: do we have to do all the gymnastics like normalization for input 
> order insensitive operators when just computing the hash code? Is that the best 
> we can do?
>
> Currently it is limited to SQL operators with 2 operands. What if I have a 
> customized AND/OR SQL operator that can accept more than 2 operands?
>
> On 2020/07/15 06:34:55, Danny Chan  wrote:
> > I have extended the logic to support all the symmetrical operators (=, <> 
> > ..) and the binary comparison operators (>=, < ..),
> > not just RexInputRef. Customized SQL operators can also benefit. [1]
> >
> > The approach is to compare the operands by SqlKind first, then fall back to 
> > their hashcode (eagerly computed).
> >
> > Adding a SqlOperator interface is not that feasible because most of the 
> > operators share the same class (for example, SqlBinaryOperator) with 
> > different SqlKind. Instead, the current code has a special SqlKind to 
> > distinguish the operators we want to normalize, which I think is a better 
> > way.
> >
> > An eagerly computed hashcode for operands should not be a problem because 
> > sooner or later the hashcode would be computed in the digest cache of the 
> > planner, and the RexCall already has a hashcode cache.
> >
> > We don’t really need to care about what the operand sort 
> > algorithm is, as long as:
> >
> >
> > • the algorithm is deterministic
> > • It can cover most of the cases, like the user defined function
> > • It does not affect the performance too much
> >
> > [1]  https://github.com/apache/calcite/pull/2065
> >
> > Best,
> > Danny Chan
> > On Jul 15, 2020, 12:10 PM +0800, dev@calcite.apache.org wrote:
> > >
> > > Sql operators like EQUALS, NOT_EQUALS, AND, OR should return false to 
> > > indicate they are not input order sensitive.
> >

Re: [DISCUSS] Default disable the RexNode normalization(or operands reorder)

2020-07-15 Thread Danny Chan

I have extended the logic to support all the symmetrical operators (=, <> ..) 
and the binary comparison operators (>=, < ..), 
not just RexInputRef. Customized SQL operators can also benefit. [1]

The approach is to compare the operands by SqlKind first, then fall back to their 
hashcode (eagerly computed).

Adding a SqlOperator interface is not that feasible because most of the operators 
share the same class (for example, SqlBinaryOperator) with different SqlKind. 
Instead, the current code has a special SqlKind to distinguish the operators we 
want to normalize, which I think is a better way.

An eagerly computed hashcode for operands should not be a problem because sooner or 
later the hashcode would be computed in the digest cache of the planner, and the 
RexCall already has a hashcode cache.

We don’t really need to care about what the operand sort algorithm 
is, as long as:


• the algorithm is deterministic
• It can cover most of the cases, like the user defined function
• It does not affect the performance too much

[1]  https://github.com/apache/calcite/pull/2065
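
The ordering described in this message (compare by kind first, then fall back to the hash code) can be sketched in plain Java as below. This is a simplified illustration under assumptions, not the code from the PR: OperandOrdering, Operand and Kind are hypothetical names, with Kind standing in for SqlKind and the digest string standing in for a real operand.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

/** Hypothetical operand model; Kind stands in for SqlKind. Not Calcite code. */
final class OperandOrdering {
  enum Kind { INPUT_REF, LITERAL, CALL }

  static final class Operand {
    final Kind kind;
    final String digest;

    Operand(Kind kind, String digest) {
      this.kind = kind;
      this.digest = digest;
    }

    // String.hashCode is specified by the JDK, so it is stable across runs.
    @Override public int hashCode() {
      return digest.hashCode();
    }

    @Override public boolean equals(Object o) {
      return o instanceof Operand
          && ((Operand) o).kind == kind
          && ((Operand) o).digest.equals(digest);
    }

    @Override public String toString() {
      return digest;
    }
  }

  /** Compares by kind first, then falls back to the (cached or eager) hash code. */
  static final Comparator<Operand> ORDER =
      Comparator.<Operand, Kind>comparing(o -> o.kind)
          .thenComparingInt(Operand::hashCode);

  /** Returns a sorted copy of the operands; the input list is left untouched. */
  static List<Operand> normalize(List<Operand> operands) {
    List<Operand> sorted = new ArrayList<>(operands);
    sorted.sort(ORDER);
    return sorted;
  }
}
```

One caveat of this sketch: two distinct operands of the same kind whose hash codes collide compare as equal, so their relative order then depends on the input order. A real implementation would need a further tie-breaker to stay fully deterministic.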

Best,
Danny Chan
On Jul 15, 2020, 12:10 PM +0800, dev@calcite.apache.org wrote:
>
> Sql operators like EQUALS, NOT_EQUALS, AND, OR should return false to 
> indicate they are not input order sensitive.

Re: [RESULT] [VOTE] Release apache-calcite-avatica-go-5.0.0 (release candidate 0)

2020-07-14 Thread Danny Chan

Thanks for the great work, Francis ~

Best,
Danny Chan
On Jul 15, 2020, 7:23 AM +0800, dev@calcite.apache.org wrote:
>
> Francis

Re: [DISCUSS] Default disable the RexNode normalization(or operands reorder)

2020-07-13 Thread Danny Chan

Yes, it is. We can keep it as a builtin promotion.

Best,
Danny Chan
On Jul 13, 2020, 3:48 PM +0800, Vladimir Sitnikov wrote:
> > Hi, all, I’m planning to default disable the RexNode normalization in
> CALCITE-4073, if you have any objections, please let me know in 24 hours,
> thanks so much ~
>
> I assume it would still normalize RexNodes when building plan digest. Is it
> the case?
>
> Vladimir

Re: [DISCUSS] Default disable the RexNode normalization(or operands reorder)

2020-07-13 Thread Danny Chan

Hi all, I’m planning to disable the RexNode normalization by default in 
CALCITE-4073. If you have any objections, please let me know within 24 hours. 
Thanks so much ~

Looking forward to your feedback ~

Best,
Danny Chan
On Jul 8, 2020, 4:38 PM +0800, Danny Chan wrote:
> In CALCITE-2450, we proposed a change to normalize the RexNode, and there is 
> a discussion [1]. The change is in a very early phase and the normalization 
> pattern is unstable.
>
> There is actually no consensus about what form (or pattern) a 
> desired normalization should take:
>
> • People may have different requests in different contexts.
> • Different downstream projects may also have different requests
>
> The problem becomes critical after CALCITE-3786 because more cases 
> are normalized (about 50+ plan changes). In CALCITE-3786, we moved the 
> normalization to the constructor because digest equality and object equality 
> should be equivalent for RexCalls.
>
> Downstream projects like Apache Flink would have many more cases with 
> normalized plans. But actually, the normalization gains little. I think other 
> downstream projects are in a similar situation.
>
> I would suggest disabling the normalization by default until it is “stable” 
> enough, or at least until we have a consensus about what a normalized 
> pattern should be. There is an issue [3] already and we can have more 
> discussion based on that.
>
> I appreciate your suggestions, thanks in advance ~
>
> [1] 
> https://lists.apache.org/x/thread.html/54bf3ed733eb7e725ce3ea397334aad8f1323ead13e450b1753b1521@%3Cdev.calcite.apache.org%3E
> [2] https://issues.apache.org/jira/browse/CALCITE-2450
> [3] https://issues.apache.org/jira/browse/CALCITE-4073
>
> Best,
> Danny Chan

Re: [VOTE] Release apache-calcite-avatica-go-5.0.0 (release candidate 0)

2020-07-13 Thread Danny Chan

Docker version 19.03.5, docker-compose version 1.25.4

* Checked signatures and checksums OK
* Release note OK
* docker-compose run test OK (for git tag and source artifacts)
* Checked diff between repo and artifacts OK

+1 (binding)

Best,
Danny Chan
On Jul 13, 2020, 6:31 AM +0800, dev@calcite.apache.org wrote:
>
> Docker version 19.03.11, docker-compose version 1.25.5, build 8a1c60f6
>
> * Checked signatures and checksums OK
> * Went over release note (Nice to see new contributors) OK
> * Checkout from git tag and run tests (docker-compose run test) OK
> * Download source artifacts and run tests (docker-compose run test) OK
> * Checked diff between repo and artifacts OK
>
> +1 (binding)

Re: Calcite PR CI offten failed due to elasticsearch:test

2020-07-10 Thread Danny Chan

Thanks so much for taking care~

Andrei Sereda wrote on Fri, Jul 10, 2020 at 11:59 AM:

> Hello,
>
> I'll take a look at failing tests. It seems that embedded ES doesn't start
> correctly.
>
> Andrei.
>
> On Thu, Jul 9, 2020 at 10:21 PM JiaTao Tao  wrote:
>
> > another case
> > https://travis-ci.org/github/apache/calcite/jobs/706573803
> >
> > FAILURE   0.0sec, org.apache.calcite.test.RelMetadataTest >
> > testPullUpPredicatesFromUnion2()
> > java.lang.AssertionError:
> > Expected: "[=($0, 1), OR(AND(=($1, 2), =($2, 3)), =($1, 4))]"
> >  but: was "[=($0, 1), OR(AND(=($2, 3), =($1, 2)), =($1, 4))]"
> >
> > Regards!
> >
> > Aron Tao
> >
> >
> > JiaTao Tao wrote on Thu, Jul 9, 2020 at 2:46 PM:
> >
> > > Hi Stamatis Zampetakis
> > >
> > > I agree with you: in my local tests it never fails, so it seems to be only a
> > > PR CI problem.
> > > Do you have any idea about disabling the tests only in PR CI?
> > >
> > > Regards!
> > >
> > > Aron Tao
> > >
> > >
> > > Stamatis Zampetakis wrote on Thu, Jul 2, 2020 at 5:05 PM:
> > >
> > >> I didn't observe the problem on my local post so if it is only a CI
> > >> problem
> > >> then it would be nice if we could disable the tests only there.
> > >>
> > >> Best,
> > >> Stamatis
> > >>
> > >> On Thu, Jul 2, 2020 at 5:58 AM JiaTao Tao wrote:
> > >>
> > >> > Ok, thanks for your suggestion. I'll open a JIRA and track the status.
> > >> >
> > >> >
> > >> > Regards!
> > >> >
> > >> > Aron Tao
> > >> >
> > >> >
> > >> > Danny Chan wrote on Thu, Jul 2, 2020 at 11:37 AM:
> > >> >
> > >> > > 2 and 3 don’t really solve the problem; I would choose 1. You should
> > >> > > log these tests in the issue in order to track the status ~
> > >> > >
> > >> > > Best,
> > >> > > Danny Chan
> > >> > > On Jul 2, 2020, 11:15 AM +0800, JiaTao Tao wrote:
> > >> > > > Hi Danny
> > >> > > >
> > >> > > >
> > >> > > > There are 4 failed UTs, and the root cause is
> > >> > > > java.net.SocketTimeoutException:
> > >> > > > 30,000 milliseconds timeout on connection http-outgoing-2 :
> > >> > > >
> > >> > > > elasticsearch.MatchTest > initializationError
> > >> > > > elasticsearch.BooleanLogicTest > initializationError
> > >> > > > elasticsearch.AggregationTest > initializationError
> > >> > > > elasticsearch.ProjectionTest > initializationError
> > >> > > >
> > >> > > > Here come the 3 solutions, hope to hear your advice.
> > >> > > > 1. Disable these UTs
> > >> > > > 2. Increase the timeout
> > >> > > > 3. Mark these UTs as slow tests
> > >> > > >
> > >> > > >
> > >> > > > Regards!
> > >> > > >
> > >> > > > Aron Tao
> > >> > > >
> > >> > > >
> > >> > > > Danny Chan wrote on Sun, Jun 28, 2020 at 10:01 AM:
> > >> > > >
> > >> > > > > Yes, it times out often. Can you log an issue there and disable
> > >> > > > > the case?
> > >> > > > >
> > >> > > > > Best,
> > >> > > > > Danny Chan
> > >> > > > > On Jun 26, 2020, 6:00 PM +0800, JiaTao Tao wrote:
> > >> > > > > > Seems an unstable case.
> > >> > > > > >
> > >> > > > > >
> > >> > > > > > Regards!
> > >> > > > > >
> > >> > > > > > Aron Tao
> > >> > > > >
> > >> > >
> > >> >
> > >>
> > >
> >
>

Re: [DISCUSS] Default disable the RexNode normalization(or operands reorder)

2020-07-10 Thread Danny Chan

I mean not normalizing by default. The normalization feature has been present since release
1.22.0.

Stamatis Zampetakis wrote on Sat, Jul 11, 2020 at 7:04 AM:

> Hi Danny,
>
> From the initial discussion, I was leaning more towards the idea of having
> a separate component (something like what is described in CALCITE-4073)
> that people can use if they really need to.
> From my experience so far I can confirm that different projects  have
> different needs in terms of normalization.
> Putting this logic in the constructor forces people to follow a specific
> pattern so it would be nice if we could avoid that.
>
> I am not sure I understand what you mean when you say disabling it by
> default. Are you talking about features that are already released or the
> things currently in progress for 1.24.0?
>
> Best,
> Stamatis
>
> On Wed, Jul 8, 2020 at 10:39 AM Danny Chan  wrote:
>
> > In CALCITE-2450, we proposed a change to normalize the RexNode, and there
> > is a discussion [1]. The change is in a very early phase and the
> > normalization pattern is unstable.
> >
> > There is actually no consensus about what form (or pattern) a
> > desired normalization should take:
> >
> > • People may have different requests in different contexts.
> > • Different downstream projects may also have different requests
> >
> > The problem becomes critical after CALCITE-3786 because more
> > cases are normalized (about 50+ plan changes). In CALCITE-3786, we moved
> > the normalization to the constructor because digest equality and object
> > equality should be equivalent for RexCalls.
> >
> > Downstream projects like Apache Flink would have many more cases with
> > normalized plans. But actually, the normalization gains little. I think
> > other downstream projects are in a similar situation.
> >
> > I would suggest disabling the normalization by default until it is “stable”
> > enough, or at least until we have a consensus about what a normalized
> > pattern should be. There is an issue [3] already and we can have more
> > discussion based on that.
> >
> > I appreciate your suggestions, thanks in advance ~
> >
> > [1]
> > https://lists.apache.org/x/thread.html/54bf3ed733eb7e725ce3ea397334aad8f1323ead13e450b1753b1521@%3Cdev.calcite.apache.org%3E
> > [2] https://issues.apache.org/jira/browse/CALCITE-2450
> > [3] https://issues.apache.org/jira/browse/CALCITE-4073
> >
> > Best,
> > Danny Chan
> >
>

[DISCUSS] Default disable the RexNode normalization(or operands reorder)

2020-07-08 Thread Danny Chan

In CALCITE-2450, we proposed a change to normalize the RexNode, and there is a 
discussion [1]. The change is in a very early phase and the normalization 
pattern is unstable.

There is actually no consensus about what form (or pattern) a desired 
normalization should take:

• People may have different requests in different contexts.
• Different downstream projects may also have different requests

The problem becomes critical after CALCITE-3786 because more cases are 
normalized (about 50+ plan changes). In CALCITE-3786, we moved the normalization 
to the constructor because digest equality and object equality should be equivalent 
for RexCalls.

Downstream projects like Apache Flink would have many more cases with 
normalized plans. But actually, the normalization gains little. I think other 
downstream projects are in a similar situation.

I would suggest disabling the normalization by default until it is “stable” 
enough, or at least until we have a consensus about what a normalized pattern 
should be. There is an issue [3] already and we can have more discussion based 
on that.

I appreciate your suggestions, thanks in advance ~

[1] 
https://lists.apache.org/x/thread.html/54bf3ed733eb7e725ce3ea397334aad8f1323ead13e450b1753b1521@%3Cdev.calcite.apache.org%3E
[2] https://issues.apache.org/jira/browse/CALCITE-2450
[3] https://issues.apache.org/jira/browse/CALCITE-4073

Best,
Danny Chan

Re: [DISCUSS] Make RexNode serializable

2020-07-07 Thread Danny Chan

Serializing the RexNode to JSON is a solution, but I’m afraid it cannot 
solve the problem completely.
One problem is how to parse the JSON back into a RexNode: the 
current RelJsonReader can only re-parse a RelNode, not a RexNode, and it 
needs the RelOptSchema to look up the operators.

In the distributed scenarios of Beam, I’m afraid it is hard to get the 
RelOptSchema at execution time; we usually only see the RelOptSchema at 
SQL compile time.

Best,
Danny Chan
On Jul 8, 2020, 3:39 AM +0800, Roman Kondakov wrote:
> Hi Rui,
>
> AFAIK, RelNodes can be serialized to and deserialized from JSON format.
> See test [1] as an example. If I understand it correct, RelNodes are
> serialized along with enclosed RexNodes, so you can transfer them over
> the network as plain strings.
>
> [1]
> https://github.com/apache/calcite/blob/f64cdcbb9f6535650f0227da19640e736496a9c3/core/src/test/java/org/apache/calcite/plan/RelWriterTest.java#L88
>
> --
> Roman Kondakov
>
> On 07.07.2020 22:13, Enrico Olivelli wrote:
> > Rui
> >
> > On Tue, Jul 7, 2020, 20:30 Rui Wang wrote:
> >
> > > Hi Community,
> > >
> > > In Apache Beam we are facing a use case where we need to keep RexNode in
> > > our distributed primitives. Because of the nature of distributed 
> > > computing,
> > > Beam requires the usage of those primitives be serializable (thus those
> > > primitives can be sent over the network to backend/workers for
> > > further execution).
> > >
> > > In the Java world this requirement means to make RexNode implement the 
> > > Java
> > > Serializable interface.
> > >
> > > A workaround right now is to create a bunch of classes to "clone" RexNode
> > > while making those classes implement the Serializable interface.
> > >
> >
> > Did you consider using a framework like Kryo, which allows you to
> > serialize non-Serializable classes?
> >
> > I think that in general Java serialisation is not efficient as it is too
> > general purpose.
> > It also brings in a few security issues.
> >
> > Maybe an alternative idea is to add some serialisation ad-hoc mechanism in
> > RexNode.
> > We should also ensure that every RexNode will be able to be serialized and
> > deserialized.
> >
> > Enrico
> >
> >
> > > So what do you think of the idea that makes RexNode implement the
> > > Serializable interface?
> > >
> > >
> > > -Rui
> > >
> >

Re: Calcite PR CI offten failed due to elasticsearch:test

2020-07-01 Thread Danny Chan

2 and 3 don’t really solve the problem; I would choose 1. You should log 
these tests in the issue in order to track the status ~

Best,
Danny Chan
On Jul 2, 2020, 11:15 AM +0800, JiaTao Tao wrote:
> Hi Danny
>
>
> There are 4 failed UTs, and the root cause is java.net.SocketTimeoutException:
> 30,000 milliseconds timeout on connection http-outgoing-2 :
>
> elasticsearch.MatchTest > initializationError
> elasticsearch.BooleanLogicTest > initializationError
> elasticsearch.AggregationTest > initializationError
> elasticsearch.ProjectionTest > initializationError
>
> Here come the 3 solutions, hope to hear your advice.
> 1. Disable these UTs
> 2. Increase the timeout
> 3. Mark these UTs as slow tests
>
>
> Regards!
>
> Aron Tao
>
>
> Danny Chan wrote on Sun, Jun 28, 2020 at 10:01 AM:
>
> > Yes, it times out often. Can you log an issue there and disable the case?
> >
> > Best,
> > Danny Chan
> > On Jun 26, 2020, 6:00 PM +0800, JiaTao Tao wrote:
> > > Seems an unstable case.
> > >
> > >
> > > Regards!
> > >
> > > Aron Tao
> >

Re: Fwd: Announcing Apachecon @Home 2020

2020-06-30 Thread Danny Chan

That’s true, look forward to the new talk ~

Best,
Danny Chan
On Jun 29, 2020, 9:04 PM +0800, Michael Mior wrote:
> This could be a great opportunity for anyone interested in giving a
> talk about Calcite and normally unable to attend ApacheCon.
>
> --
> Michael Mior
> mm...@apache.org
>
>
> -- Forwarded message -
> From: Rich Bowen
> Date: Mon, Jun 29, 2020 at 09:01
> Subject: Announcing Apachecon @Home 2020
> To: plann...@apachecon.com 
>
>
> Dear PMCs,
>
> We’re pleased to announce that ApacheCon 2020 will be held online,
> September 29th through October 1st, 2020. You can see the full details
> in the various other announcements that have been going out to other
> lists you’re on, so I won’t repeat all of that again here.
>
> My message for you is that, due to the online nature of the event, we’re
> freed of the normal space/time constraints that we have to deal with at
> in-person events. We can have pretty much as much content as we want, on
> as many topics as we want.
>
> If your project (or family of projects) wants to have a presence at
> ApacheCon, from one “what is it?” talk all the way up to a full track of
> content, please get in touch with me - rbo...@apache.org - as soon as
> possible to arrange that.
>
> We are on a very short timeline, due to a long list of blockers that we
> have had to deal with over the past few months, and so please don’t
> delay. I will need for you to designate a track chair who will be my
> single point of contact for your track. You can handle communication
> with your project however you like, but I need one person that I can
> rely on to be that point of contact. That track chair should subscribe
> to plann...@apachecon.com in order to be included in all relevant
> conversations
>
> I hope to have helpful documentation for track chairs, so you won’t be
> doing it all on your own.
>
> I really hope to see your project at ApacheCon @Home 2020.
>
> Rich Bowen
> VP Conferences, The Apache Software Foundation
> --
> Michael Mior
> mm...@apache.org

Re: Help to parse spark dialect

2020-06-27 Thread Danny Chan

The Apache Flink SQL parser template is a good reference [1].

[1] 
https://github.com/apache/flink/blob/master/flink-table/flink-sql-parser/src/main/codegen/includes/parserImpls.ftl

Best,
Danny Chan
On Jun 27, 2020, 4:26 AM +0800, Tang, Mingjun wrote:
> Hi dear calcite community,
>
> This is a developer from eBay, I’m trying to parse and optimize spark sql 
> with calcite, but there’re many grammars not supported by default calcite 
> parser.jj, such as ‘create or replace temporary view’ or ‘insert overwrite’, 
> is there a template file I can use to parse spark sql?
>
> Best Regards,
> Wesley

Re: Calcite PR CI offten failed due to elasticsearch:test

2020-06-27 Thread Danny Chan

Yes, it times out often. Can you log an issue there and disable the case?

Best,
Danny Chan
On Jun 26, 2020, 6:00 PM +0800, JiaTao Tao wrote:
> Seems an unstable case.
>
>
> Regards!
>
> Aron Tao

Re: [ANNOUNCE] Apache Calcite Avatica 1.17.0 released

2020-06-22 Thread Danny Chan

Thanks for the work, Francis~

Francis Chuang wrote on Tue, Jun 23, 2020 at 8:10 AM:

> The Apache Calcite team is pleased to announce the release of Apache
> Calcite Avatica 1.17.0.
>
> Avatica is a framework for building database drivers. Avatica defines a
> wire API and serialization mechanism for clients to communicate with a
> server as a proxy to a database. The reference Avatica client and server
> are implemented in Java and communicate over HTTP. Avatica is a
> sub-project of Apache Calcite
>
> Apache Calcite Avatica 1.17.0 is a small routine release that includes a
> few dependency updates and bug fixes. Of particular note is a fix to
> resolve a multilingual decoding issue that parses Chinese characters as
> gibberish. This release has also been tested with JDK 14 and Guava 29.
> For a full list of changes, please see the release notes:
>
>https://calcite.apache.org/avatica/docs/history.html#v1-17-0
>
> The release is available here:
>
>https://calcite.apache.org/avatica/downloads/avatica.html
>
> We welcome your help and feedback. For more information on how to report
> problems and get involved, visit the project website at:
>
> https://calcite.apache.org/avatica/
>
> or the Apache Calcite project website:
>
> https://calcite.apache.org/
>
> Thanks to everyone involved!
>
> Francis Chuang, on behalf of the Apache Calcite team.
>

Re: [RESULT] [VOTE] Release apache-calcite-avatica-1.17.0 (release candidate 0)

2020-06-21 Thread Danny Chan

Thanks for the great work, Francis ~

Best,
Danny Chan
On Jun 22, 2020, 7:11 AM +0800, Francis Chuang wrote:
> Thanks to everyone who has tested the release candidate and given their
> comments and votes.
>
> The tally is as follows.
>
> 3 binding +1s:
> Francis Chuang
> Julian Hyde
> Stamatis Zampetakis
>
> No 0s or -1s.
>
> Therefore, I am delighted to announce that the proposal to release
> Apache Calcite Avatica 1.17.0 has passed.
>
> Thanks everyone. We’ll now roll the release out to the mirrors.
>
> Francis

Re: Nested data handling in Caclite

2020-06-19 Thread Danny Chan

I tried your patch and I think you are right: RelStructuredTypeFlattener is 
missing a feature. When the field you want to project is a struct that is 
not at the top level, an error is thrown.

I tried disabling the flattening and it works well. Originally the 
RelStructuredTypeFlattener was designed to extract nested fields so that 
work like de-correlation is much easier, but it also changes the plan and is 
hard to maintain, and it is not updated frequently. If possible, disable the 
flattening.

Best,
Danny Chan
On Jun 19, 2020, 4:39 AM +0800, Slim Bouguerra wrote:
> @Danny it is attached to the case CALCITE-4065
>
> https://jira.apache.org/jira/secure/attachment/13005815/13005815_test_cases_CALCITE-4065.patch
> Thanks
>
> On Thu, Jun 18, 2020 at 12:29 AM Danny Chan  wrote:
>
> > What diff, I didn’t see that ~
> >
> > Best,
> > Danny Chan
> > On Jun 16, 2020, 11:52 PM +0800, Slim Bouguerra wrote:
> > > Hi Danny, I ran some tests yesterday with
> > > RelToSqlConverterStructsTest. I have attached the diff; let me know what
> > > you think
> > >
> > >
> > > > On Tue, Jun 16, 2020 at 2:22 AM Danny Chan 
> > wrote:
> > > > > Take SqlToRelConverterTest#testAliasUnnestArrayPlanWithSingleColumn
> > for an example, you should make every record type with
> > StructKind.PEEK_FIELDS, so that nested record type can be also accessed
> > with DOT.
> > > > >
> > > > > Best,
> > > > > Danny Chan
> > > > > On Jun 16, 2020, 12:50 PM +0800, Slim Bouguerra wrote:
> > > > > > Hi Danny,
> > > > > > Thanks for the suggestion, but that did not solve the problem,
> > still
> > > > > > getting the same exception, Not sure If I am missing something ?
> > Do you
> > > > > > have an example of this usage ?
> > > > > > Again the goal here is to select a Row for a Row as an example
> > this is the
> > > > > > column type sketch
> > > > > > outerRow(address_kind, address_inner_row(ZipCode, StreetNum,))
> > > > > > SELECT outerRow.address_inner_row FROM table.
> > > > > >
> > > > > > FYI select outerRow.address_kind works because it is a scalar and
> > after
> > > > > > adding your suggestion I see that select address_kind from table.
> > > > > >
> > > > > > On Mon, Jun 15, 2020 at 7:21 PM Danny Chan 
> > wrote:
> > > > > >
> > > > > > > Hi, when you create a structure type, you should choose
> > > > > > > StructKind.PEEK_FIELDS instead, which let you to access the
> > nested fields
> > > > > > > with DOT, i.e. “a.b.c”.
> > > > > > >
> > > > > > > Best,
> > > > > > > Danny Chan
> > > > > > > On Jun 16, 2020, 4:21 AM +0800, Slim Bouguerra wrote:
> > > > > > > > I am using this thread since the question seems related.
> > > > > > > > As of now I cannot see a way to project a nested record (FYI
> > > > > > > > scalar works). https://jira.apache.org/jira/browse/CALCITE-4065
> > > > > > > > @Igor any idea if this can be done without major work on the
> > > > > > > > SqlRelToRelConverter ?
> > > > > > > > Also I am thinking about turning off the flatten stage but not
> > sure this
> > > > > > > is
> > > > > > > > going to happen (seems like a pandora box kind of flag where
> > you do not
> > > > > > > > know what to expect)
> > > > > > > >
> > > > > > > > On Thu, Oct 24, 2019 at 3:53 AM Igor Guzenko <
> > ihor.huzenko@gmail.com
> > > > > > > >
> > > > > > > > wrote:
> > > > > > > >
> > > > > > > > > Hello Naveen,
> > > > > > > > >
> > > > > > > > > 1. If I understand correctly, then yes you can extract
> > nested fields
> > > > > > > from
> > > > > > > > > struct type. The syntax depends on StructKind value for your
> > data type,
> > > > > > > > > for example for FULLY_QUALIFIED struct you should first
> > > > > > > > > make alias for your table and then request nested field like,
> > > > > > > > > table_alias.struct_column.nested_field. In rel tree such
>

Re: How to support case-insensitive identifier resolution?

2020-06-18 Thread Danny Chan

Thanks Viliam ~

Usually we use the quote character to avoid conflicts with reserved 
keywords, e.g. DEFAULT is a reserved keyword for many vendors; if you want to 
use it as a column name or object name, then quoting is an option.

The “create table …” example you gave seems problematic because DDL and DML 
case-sensitivity should be the same; thus, when you create the second table 
`MyTable`, the catalog should throw instead.
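
As a rough illustration of that point, here is a minimal Java sketch of a catalog that applies the same case-insensitive rule to both DDL (create) and DML (resolve), so creating `MyTable` after `MYTABLE` is rejected rather than silently changing what an existing query reads. CaseInsensitiveCatalog and its methods are hypothetical names for this example only and are not a Calcite API.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

/** Hypothetical catalog with case-insensitive table names; not a Calcite API. */
final class CaseInsensitiveCatalog {
  // Maps the upper-cased key to the name as originally declared.
  private final Map<String, String> tables = new HashMap<>();

  private static String key(String name) {
    // Locale.ROOT avoids locale-dependent case mapping (e.g. the Turkish "i").
    return name.toUpperCase(Locale.ROOT);
  }

  /** DDL path: rejects a name that differs from an existing one only by case. */
  void createTable(String name) {
    String previous = tables.putIfAbsent(key(name), name);
    if (previous != null) {
      throw new IllegalStateException(
          "Table '" + name + "' conflicts with existing table '" + previous + "'");
    }
  }

  /** DML path: uses the same rule as DDL; returns the declared name, or null. */
  String resolve(String name) {
    return tables.get(key(name));
  }
}
```

Because both paths go through the same key function, `SELECT * FROM MyTable` keeps resolving to `MYTABLE`, and the later `CREATE TABLE MyTable` fails instead of introducing a second candidate.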

Best,
Danny Chan
On Jun 18, 2020, 10:02 PM +0800, Viliam Durina wrote:
> > Does any SQL vendors behave like that ?
> I don't know of any vendor.
>
> > What do you mean by “sometimes"
> Quoted identifiers are sensitive and unquoted are not - that's what I meant
> that identifiers are "sometimes" case-sensitive.
>
> > SQL must have a deterministic semantic I think.
> You can define a deterministic rule to decide which object to match, but
> it's just weird: if you compare identifiers to object names in a
> case-insensitive way, then having `MyTable` and `MYTABLE` should not be
> allowed. This weird situation can happen:
> - you create `MYTABLE` object
> - you have a query `SELECT * FROM MyTable` in your application
> - later you create `MyTable` object. It succeeds because it has a distinct
> name
> - the query now reads from a different table, even though the original
> object wasn't changed, just a new, distinct one was created.
>
> My point is that the behavior originally requested probably isn't a good
> idea.
>
> Regards,
> Viliam
>
> On Thu, 18 Jun 2020 at 10:41, Danny Chan  wrote:
>
> > > when the identifier was
> > > quoted and case-insensitive, if it wasn’t
> >
> > Do any SQL vendors behave like that? What do you mean by “sometimes”?
> > SQL must have deterministic semantics, I think.
> >
> > Best,
> > Danny Chan
> > On Jun 18, 2020, 4:23 PM +0800, dev@calcite.apache.org wrote:
> > >
> > > when the identifier was
> > > quoted and case-insensitive, if it wasn't
> >
>
>
> --
> Viliam Durina
> Jet Developer
> hazelcast®

Re: How to support case-insensitive identifier resolution?

2020-06-18 Thread Danny Chan

> when the identifier was
> quoted and case-insensitive, if it wasn’t

Do any SQL vendors behave like that? What do you mean by “sometimes”? SQL 
must have deterministic semantics, I think.

Best,
Danny Chan
On Jun 18, 2020, 4:23 PM +0800, dev@calcite.apache.org wrote:
>
> when the identifier was
> quoted and case-insensitive, if it wasn't

Re: Nested data handling in Caclite

2020-06-18 Thread Danny Chan

What diff, I didn’t see that ~

Best,
Danny Chan
On Jun 16, 2020, 11:52 PM +0800, Slim Bouguerra wrote:
> Hi Danny, I ran some tests yesterday with RelToSqlConverterStructsTest. I 
> have attached the diff; let me know what you think
>
>
> > On Tue, Jun 16, 2020 at 2:22 AM Danny Chan  wrote:
> > > Take SqlToRelConverterTest#testAliasUnnestArrayPlanWithSingleColumn for 
> > > an example, you should make every record type with 
> > > StructKind.PEEK_FIELDS, so that nested record type can be also accessed 
> > > with DOT.
> > >
> > > Best,
> > > Danny Chan
> > > On June 16, 2020 at 12:50 PM (+0800), Slim Bouguerra wrote:
> > > > Hi Danny,
> > > > Thanks for the suggestion, but that did not solve the problem, still
> > > > getting the same exception, Not sure If I am missing something ? Do you
> > > > have an example of this usage ?
> > > > Again the goal here is to select a Row for a Row as an example this is 
> > > > the
> > > > column type sketch
> > > > outerRow(address_kind, address_inner_row(ZipCode, StreetNum,))
> > > > SELECT outerRow.address_inner_row FROM table.
> > > >
> > > > FYI select outerRow.address_kind works because it is a scalar and after
> > > > adding your suggestion I see that select address_kind from table.
> > > >
> > > > On Mon, Jun 15, 2020 at 7:21 PM Danny Chan  wrote:
> > > >
> > > > > Hi, when you create a structure type, you should choose
> > > > > StructKind.PEEK_FIELDS instead, which let you to access the nested 
> > > > > fields
> > > > > with DOT, i.e. “a.b.c”.
> > > > >
> > > > > Best,
> > > > > Danny Chan
> > > > > On June 16, 2020 at 4:21 AM (+0800), Slim Bouguerra wrote:
> > > > > > I am using this thread since the question seems related.
> > > > > > As of now I can not say a way to project a nested record (FYI scalar
> > > > > > works). https://jira.apache.org/jira/browse/CALCITE-4065
> > > > > > @Igor any idea if this can be done without major work on the
> > > > > > SqlRelToRelConverter ?
> > > > > > Also I am thinking about turning off the flatten stage but not sure 
> > > > > > this
> > > > > is
> > > > > > going to happen (seems like a pandora box kind of flag where you do 
> > > > > > not
> > > > > > know what to expect)
> > > > > >
> > > > > > > On Thu, Oct 24, 2019 at 3:53 AM Igor Guzenko wrote:
> > > > > >
> > > > > > > Hello Naveen,
> > > > > > >
> > > > > > > 1. If I understand correctly, then yes you can extract nested 
> > > > > > > fields
> > > > > from
> > > > > > > struct type. The syntax depends on StructKind value for your data 
> > > > > > > type,
> > > > > > > for example for FULLY_QUALIFIED struct you should first
> > > > > > > make alias for your table and then request nested field like,
> > > > > > > table_alias.struct_column.nested_field. In rel tree such 
> > > > > > > expressions
> > > > > are
> > > > > > > presented as RexCall with SqlItemOperator operator.
> > > > > > > 2. Yes, this ability was implemented in CALCITE-3138 [1]. It 
> > > > > > > builds
> > > > > call to
> > > > > > > ROW type constructor function on top of flattened tree for 
> > > > > > > necessary
> > > > > > > columns.
> > > > > > > 3. Yes, examples of such functions are ROW(...), ANY_VALUE(...) 
> > > > > > > etc.
> > > > > > >
> > > > > > > In current implementation of flattener invocation of ROW 
> > > > > > > constructor
> > > > > > > function is done despite of null handling same issue exists for 
> > > > > > > some
> > > > > > > aggregate function flattening, like COUNT(struct_column).
> > > > > > > Proper null handling is real pain for flattener, original idea 
> > > > > > > was to
> > > > > > > handle special null indicator for each flattened struct, but in

Re: Nested data handling in Calcite

2020-06-16 Thread Danny Chan

Take SqlToRelConverterTest#testAliasUnnestArrayPlanWithSingleColumn as an
example: you should create every record type with StructKind.PEEK_FIELDS, so
that nested record types can also be accessed with DOT.

Best,
Danny Chan
On June 16, 2020 at 12:50 PM (+0800), Slim Bouguerra wrote:
> Hi Danny,
> Thanks for the suggestion, but that did not solve the problem, still
> getting the same exception, Not sure If I am missing something ? Do you
> have an example of this usage ?
> Again the goal here is to select a Row for a Row as an example this is the
> column type sketch
> outerRow(address_kind, address_inner_row(ZipCode, StreetNum,))
> SELECT outerRow.address_inner_row FROM table.
>
> FYI select outerRow.address_kind works because it is a scalar and after
> adding your suggestion I see that select address_kind from table.
>
> On Mon, Jun 15, 2020 at 7:21 PM Danny Chan  wrote:
>
> > Hi, when you create a structure type, you should choose
> > StructKind.PEEK_FIELDS instead, which let you to access the nested fields
> > with DOT, i.e. “a.b.c”.
> >
> > Best,
> > Danny Chan
> > On June 16, 2020 at 4:21 AM (+0800), Slim Bouguerra wrote:
> > > I am using this thread since the question seems related.
> > > As of now I can not say a way to project a nested record (FYI scalar
> > > works). https://jira.apache.org/jira/browse/CALCITE-4065
> > > @Igor any idea if this can be done without major work on the
> > > SqlRelToRelConverter ?
> > > Also I am thinking about turning off the flatten stage but not sure this
> > is
> > > going to happen (seems like a pandora box kind of flag where you do not
> > > know what to expect)
> > >
> > > On Thu, Oct 24, 2019 at 3:53 AM Igor Guzenko wrote:
> > >
> > > > Hello Naveen,
> > > >
> > > > 1. If I understand correctly, then yes you can extract nested fields
> > from
> > > > struct type. The syntax depends on StructKind value for your data type,
> > > > for example for FULLY_QUALIFIED struct you should first
> > > > make alias for your table and then request nested field like,
> > > > table_alias.struct_column.nested_field. In rel tree such expressions
> > are
> > > > presented as RexCall with SqlItemOperator operator.
> > > > 2. Yes, this ability was implemented in CALCITE-3138 [1]. It builds
> > call to
> > > > ROW type constructor function on top of flattened tree for necessary
> > > > columns.
> > > > 3. Yes, examples of such functions are ROW(...), ANY_VALUE(...) etc.
> > > >
> > > > In current implementation of flattener invocation of ROW constructor
> > > > function is done despite of null handling same issue exists for some
> > > > aggregate function flattening, like COUNT(struct_column).
> > > > Proper null handling is real pain for flattener, original idea was to
> > > > handle special null indicator for each flattened struct, but in
> > practice I
> > > > recognized that it's really hard to deal with flattened fields indices
> > when
> > > > related methods are called from very different points, so for now the
> > > > problem remains unsolved.
> > > > If you can't avoid dealing with null values in your struct columns you
> > > > could try to avoid invocation to SqlToRelConverter.flattenTypes(...)
> > and
> > > > check whether final plan acceptable for you. As far as I know
> > > > there is no reading material for given topic, you can investigate
> > source
> > > > code by debugging RelStructuredTypeFlattener and reading some related
> > plans
> > > > in SqlToRelConverterTest.java and SqlToRelConverterTest.xml.
> > > >
> > > > [1] https://issues.apache.org/jira/browse/CALCITE-3138
> > > >
> > > > Thanks,
> > > > Igor
> > > >
> > > > On Thu, Oct 24, 2019 at 12:57 PM Naveen Kumar
> > > >  wrote:
> > > >
> > > > > Hi,
> > > > >
> > > > > I work at Flipkart, we are using Calcite in our streaming platform.
> > In
> > > > most
> > > > > of our use cases, input data is nested. I understand Calcite flattens
> > > > > structs in scan and references fields positionally.
> > > > >
> > > > > I had a few questions on handling nested data -
> > > > >

Re: How to support case-insensitive identifier resolution?

2020-06-16 Thread Danny Chan

How did you use the Calcite parser and validator/converter?

If you use the code directly, you should configure the Lex [1] correctly for the
SqlParser.Config in the FrameworkConfig (i.e. set caseSensitive to false).

The CalciteCatalogReader is then initialized from this FrameworkConfig and
resolves identifiers based on that config.

If you go through the Avatica JDBC driver, set the connection parameters,
something like this:

jdbc:calcite:caseSensitive=false ...

[1] 
https://github.com/apache/calcite/blob/69f25863f5f4197c17927a39a82cbf1cffd12b80/core/src/main/java/org/apache/calcite/config/Lex.java#L37

Best,
Danny Chan
On June 16, 2020 at 5:05 PM (+0800), Vladimir Ozerov wrote:
> Hi colleagues,
>
> I am trying to implement case insensitive resolution of column/table/schema
> names for our Apache Calcite integration in Hazelcast and got stuck. I hope
> that the community might help me.
>
> Consider that we have a table source that contains the following Java class:
>
> class Employee {
> LocalDate birthDate
> }
>
> I would like to have an SQL engine that could resolve identifiers in
> accordance with ANSI SQL standard, which basically says that unquoted
> identifiers should be compared in a case-insensitive way, while quoted
> identifiers should be compared in a case-sensitive way. Let's focus on
> columns only for now:
>
> SELECT birthDate FROM employee // OK
> SELECT birthdate FROM employee // OK
> SELECT BIRTHDATE FROM employee // OK
> SELECT `birthDate` FROM employee // OK
> SELECT `birthdate` FROM employee // Fail
> SELECT `BIRTHDATE` FROM employee // Fail
>
> That is, my source is a collection of Java objects, and the natural name of
> the column is "birthDate". But I would like it to be accessible as
> "birthDate", birthDate, BIRTHDate, etc.
>
> My problem comes from the fact that casing configuration is applied during
> parsing and by the time the table is asked for a column, the information
> whether the user request was quoted or not is lost. Consider that I have a
> table RelDataType["birthDate"]. Now consider what happens with different
> combinations of casing configuration:
> 1) [unquoted=UNCHANGED, quoted=UNCHANGED]: "SELECT BIRTHDATE" doesn't work
> obviously
> 2) [unquoted=UPPER, quoted=UNCHANGED]: "SELECT BIRTHDATE" doesn't work
> again, because parser normalizes unqouted identifier to upper case, but
> RelDataType has a column "birthDate"
> 3) Same as p.2, but with manual normalization of RelDataType
> to RelDataType["BIRTHDATE"]: "SELECT BIRTHDATE" works now, but "SELECT
> `birthDate`" don't!
>
> Is there any built-in solution to the above problem?
>
> Regards,
> Vladimir.
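
The resolution rule Vladimir asks for (unquoted identifiers match
case-insensitively, quoted identifiers match exactly) can be modelled outside
Calcite as below. This is only a minimal Python sketch, not Calcite's API: the
function name `resolve_column` and its `quoted` flag are hypothetical, and the
sketch assumes the quoted/unquoted information is still available at lookup
time, which is exactly what the thread says is lost after parsing.

```python
def resolve_column(columns, name, quoted):
    """Return the declared column that `name` refers to, or None.

    Quoted identifiers must match a declared column exactly.
    Unquoted identifiers match case-insensitively; if more than one
    declared column matches, the reference is ambiguous.
    """
    if quoted:
        return name if name in columns else None
    matches = [c for c in columns if c.lower() == name.lower()]
    if len(matches) > 1:
        raise ValueError(f"ambiguous identifier: {name}")
    return matches[0] if matches else None


# The table from the question has a single column named "birthDate".
columns = ["birthDate"]
for name, quoted in [("birthDate", False), ("BIRTHDATE", False),
                     ("birthDate", True), ("birthdate", True)]:
    print(name, "quoted" if quoted else "unquoted",
          resolve_column(columns, name, quoted))
```

With a single case-insensitive setting (as with caseSensitive=false), the quoted
and unquoted branches collapse into the second one, so the quoted lookups that
the question expects to fail would succeed.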

Re: Nested data handling in Calcite

2020-06-15 Thread Danny Chan

Hi, when you create a structure type, you should choose StructKind.PEEK_FIELDS
instead, which lets you access the nested fields with DOT, i.e. “a.b.c”.

Best,
Danny Chan
On June 16, 2020 at 4:21 AM (+0800), Slim Bouguerra wrote:
> I am using this thread since the question seems related.
> As of now I cannot see a way to project a nested record (FYI scalar
> works). https://jira.apache.org/jira/browse/CALCITE-4065
> @Igor any idea if this can be done without major work on the
> SqlToRelConverter?
> Also I am thinking about turning off the flatten stage, but I am not sure this
> is going to happen (it seems like a Pandora's box kind of flag where you do not
> know what to expect)
>
> On Thu, Oct 24, 2019 at 3:53 AM Igor Guzenko 
> wrote:
>
> > Hello Naveen,
> >
> > 1. If I understand correctly, then yes you can extract nested fields from
> > struct type. The syntax depends on StructKind value for your data type,
> > for example for FULLY_QUALIFIED struct you should first
> > make alias for your table and then request nested field like,
> > table_alias.struct_column.nested_field. In rel tree such expressions are
> > presented as RexCall with SqlItemOperator operator.
> > 2. Yes, this ability was implemented in CALCITE-3138 [1]. It builds call to
> > ROW type constructor function on top of flattened tree for necessary
> > columns.
> > 3. Yes, examples of such functions are ROW(...), ANY_VALUE(...) etc.
> >
> > In the current flattener implementation, the ROW constructor function is
> > invoked without regard to null handling; the same issue exists for the
> > flattening of some aggregate functions, such as COUNT(struct_column).
> > Proper null handling is a real pain for the flattener. The original idea was
> > to maintain a special null indicator for each flattened struct, but in
> > practice I found it really hard to keep track of flattened field indices when
> > the related methods are called from very different places, so for now the
> > problem remains unsolved.
> > If you can't avoid null values in your struct columns, you could try skipping
> > the call to SqlToRelConverter.flattenTypes(...) and check whether the final
> > plan is acceptable for you. As far as I know, there is no reading material on
> > this topic; you can investigate the source code by debugging
> > RelStructuredTypeFlattener and reading the related plans in
> > SqlToRelConverterTest.java and SqlToRelConverterTest.xml.
> >
> > [1] https://issues.apache.org/jira/browse/CALCITE-3138
> >
> > Thanks,
> > Igor
> >
> > On Thu, Oct 24, 2019 at 12:57 PM Naveen Kumar
> >  wrote:
> >
> > > Hi,
> > >
> > > I work at Flipkart, we are using Calcite in our streaming platform. In
> > most
> > > of our use cases, input data is nested. I understand Calcite flattens
> > > structs in scan and references fields positionally.
> > >
> > > I had a few questions on handling nested data -
> > >
> > > 1. Can RelNode DAG work with nested data (instead of flattened fields)
> > > by referencing fields through their nested structure eg,
> > > data.order.orderId
> > > 2. In the current flattened behavior, can output of a query be a
> > struct.
> > > Eg if *orderId, orderData.timestamp, orderData.category* are output of
> > > select query, can I declaratively organise output to below json
> > > structure -
> > > 1.
> > >
> > >
> > >
> > >
> > >
> > > *{ "orderId": "order1", "orderData": { "timestamp": 1571904384814,
> > > "category": "shoes" } }*
> > > 3. Can output of a UDF be struct type
> > >
> > > Please point me to any reading material or example that would help with
> > > these questions.
> > >
> > > Regards,
> > > Naveen
> > >
> > > --
> > >
> > >
> > >
> > >
> > >
> > *-*
> > >
> > >
> > > *This email and any files transmitted with it are confidential and
> > > intended solely for the use of the individual or entity to whom they are
> > > addressed. If you have received this email in error, please notify the
> > > system manager. This message contains confidential information and is
> > > intended only for the individual named. If you are not the named
> > > addressee,
> > > you should not disseminate, distribute or copy this email. Please notify
> > > the sender immedia

Re: Why migrate from Maven to Gradle

2020-06-14 Thread Danny Chan

Thanks. Are there any build features that Gradle offers but Maven does not?
Gradle is not that user-friendly for new users, but it works well, and it is
more flexible than Maven, with many custom plugins.

Best,
Danny Chan
On June 13, 2020 at 1:31 AM (+0800), 丁小冬(比古) wrote:
>
> Hi, Calcite developers
>
> I would like to build calcite with Maven, but now calcite only support gradle 
> build. I have three questions:
> 1. Is there any reason that Calcite had to migrate from maven to gradle?
> 2. If I still insist on using maven to build calcite, is there any risk?
> 3. If I want to migrate from gradle back to maven, is there any quick and 
> correct method to do this?
>
> thank you for your attention, this is really important to me, looking forward 
> to your reply
>
> Great thanks
> Winter Ding

Re: NULL in IN clause

2020-06-05 Thread Danny Chan

Thanks Maytas ~

Your request is reasonable. I think line [1] can be improved: we can move the
null literal comparisons to the end of the composite predicate (OR/AND) instead
of forbidding the IN-to-OR(AND) conversion altogether whenever the IN value list
contains nulls.
Can you log an issue for it, so that anyone who is interested can contribute ~

Thanks again ~

[1] 
https://github.com/apache/calcite/blob/feae6fbc328e3a7c87693951d1623f8b47ccea59/core/src/main/java/org/apache/calcite/sql2rel/SqlToRelConverter.java#L1102

Best,
Danny Chan
On June 6, 2020 at 5:38 AM (+0800), Rui Wang wrote:
> I see. I was confused on when dim is null, how will null be equal to null.
> Did a quick check and Calcite seems to return Unknown for both "null =
> null" and "null = false". I am not sure it means for OR, Unknown is
> treated as false thus your example can be simplified.
>
>
> -Rui
>
> On Fri, Jun 5, 2020 at 2:27 PM Rui Wang  wrote:
>
> > Per the commit history I think you can find something in
> > https://issues.apache.org/jira/browse/CALCITE-373.
> >
> > Meanwhile, "dim='a' OR dim=null" can be simplified to dim='a'?
> >
> >
> >
> > -Rui
> >
> > On Fri, Jun 5, 2020 at 2:13 PM Maytas Monsereenusorn 
> > wrote:
> >
> > > Hi Calcite,
> > >
> > > I am wondering why we convert a value list with NULL into an inline table.
> > > The code is in SqlToRelConverter, where we call
> > > !containsNullLiteral(valueList). For example, if I have a SQL select *
> > > from
> > > druid.foo where dim in ('a', null) then why do we not convert this to
> > > dim='a' OR dim=null.
> > > > Furthermore, this can then be simplified to just dim='a', since dim=null
> > > > always evaluates to null.
> > >
> > > The plan I got after converting SqlNode to RelNode:
> > > > LogicalProject(__time=[$0], cnt=[$1], dim1=[$2], dim2=[$3], dim3=[$4],
> > > > m1=[$5], m2=[$6], unique_dim1=[$7])
> > > >   LogicalJoin(condition=[=($3, $8)], joinType=[inner])
> > > >     LogicalTableScan(table=[[druid, foo]])
> > > >     LogicalAggregate(group=[{0}])
> > > >       LogicalValues(tuples=[[{ 'a' }, { null }]])
> > > > however, I expect:
> > > > LogicalProject(__time=[$0], cnt=[$1], dim1=[$2], dim2=[$3], dim3=[$4],
> > > > m1=[$5], m2=[$6], unique_dim1=[$7])
> > > >   LogicalFilter(condition=[OR(=($3, 'a'), =($3, null))])
> > > >     LogicalTableScan(table=[[druid, foo]])
> > > >
> > > > or something like:
> > > > LogicalProject(__time=[$0], cnt=[$1], dim1=[$2], dim2=[$3], dim3=[$4],
> > > > m1=[$5], m2=[$6], unique_dim1=[$7])
> > > >   LogicalFilter(condition=[=($3, 'a')])
> > > >     LogicalTableScan(table=[[druid, foo]])
> > >
> > > Thanks!
> > > -Maytas
> > >
> >
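
Rui's question (can "dim='a' OR dim=null" be simplified to "dim='a'"?) can be
checked with a small model of SQL three-valued logic. The sketch below is plain
Python, not Calcite code; it represents UNKNOWN as None, and the helper names
`eq`, `or3` and `where` are made up for the illustration.

```python
def eq(a, b):
    """SQL equality: the result is UNKNOWN (None) if either side is NULL."""
    if a is None or b is None:
        return None
    return a == b


def or3(x, y):
    """SQL three-valued OR over True / False / None (UNKNOWN)."""
    if x is True or y is True:
        return True
    if x is None or y is None:
        return None
    return False


def where(rows, predicate):
    """A WHERE clause keeps a row only when the predicate is TRUE."""
    return [r for r in rows if predicate(r) is True]


rows = ["a", "b", None]
with_null = where(rows, lambda dim: or3(eq(dim, "a"), eq(dim, None)))
without_null = where(rows, lambda dim: eq(dim, "a"))
print(with_null, without_null)
```

Both filters keep the same rows, so the simplification is safe directly under a
WHERE. The two predicates are not equal as values, though: for dim='b' the first
is UNKNOWN and the second is FALSE, so the rewrite would change the result if
the predicate were negated, as in NOT IN.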

[DISCUSS] Some questions about the newly introduced window table functions (TUMBLE, HOP and SESSION)

2020-06-04 Thread Danny Chan

Hi all, in the last two releases Rui Wang contributed the window table
functions that replace the legacy group window syntax; the idea comes from the
SQL:2016 polymorphic table functions. The current implementation has drawn some
feedback (and some confusion), so this thread is meant to discuss the questions
that are still undecided.

One is about the window function argument names:

=== option1
TUMBLE(table, DESCRIPTOR(datetime), interval, offset)
HOP(table, DESCRIPTOR(time_column), slide, size)

=== option2
TUMBLE(table, DESCRIPTOR(time_column), size, offset)
HOP(table, DESCRIPTOR(time_column), size, hop_size, offset)

I would prefer option 2, for the reasons pointed out by Viliam in
CALCITE-3737 [2].


The other is about how to translate the query.

For the query: select * from TUMBLE(table, DESCRIPTOR(time_column), interval);

the plan we currently output is:

LogicalProject(ORDERID=[$0], ROWTIME=[$1], window_start=[$2], window_end=[$3])
  LogicalTableFunctionScan(invocation=[TUMBLE(DESCRIPTOR($1), interval:INTERVAL
  MINUTE)], rowType=[RecordType(INTEGER ORDERID, TIMESTAMP(0) time_column,
  TIMESTAMP(0) window_start, TIMESTAMP(0) window_end)])
    LogicalProject(ORDERID=[$0], ROWTIME=[$1])
      LogicalTableScan(table=[[CATALOG, SALES, SHIPMENTS]])

But this does not seem right, because the "table table_name" argument is not a
relational expression. In CALCITE-3955 [1], Julian pointed out that we should
translate the "table" argument as a CURSOR (or as a TABLE, per Viliam). Another
question is how to translate the referenced name "time_column": as a correlation
variable? That would also be confusing, because there is actually no correlation
in the plan.

Any suggestions are appreciated !


[1] https://issues.apache.org/jira/browse/CALCITE-3955
[2] https://issues.apache.org/jira/browse/CALCITE-3737

Best,
Danny Chan
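
For readers comparing the two argument orders, here is a rough Python model of
what the window functions compute for each row, using the option 2 parameter
names (size, hop_size, offset). It is a sketch under stated assumptions, not
Calcite's implementation: timestamps and intervals are plain integers (for
example seconds), windows are half-open [start, end), and the function names
`tumble` and `hop` are only borrowed from the SQL functions.

```python
def tumble(ts, size, offset=0):
    """Return (window_start, window_end) of the tumbling window containing ts."""
    start = ts - (ts - offset) % size
    return start, start + size


def hop(ts, size, hop_size, offset=0):
    """Return every hopping window [start, start + size) that contains ts."""
    # Latest window start that is not after ts.
    start = ts - (ts - offset) % hop_size
    windows = []
    # A window contains ts as long as its start is later than ts - size.
    while start > ts - size:
        windows.append((start, start + size))
        start -= hop_size
    return sorted(windows)


print(tumble(125, 60))      # one window per row
print(hop(125, 60, 30))     # size / hop_size overlapping windows per row
```

A tumbling window assigns each row to exactly one window, which is why the plan
above only appends window_start and window_end columns; a hopping window can
emit several output rows for one input row.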

Re: Using indexes rather than table scans with Calcite

2020-05-29 Thread Danny Chan

Calcite supports table hints now, and the syntax is Oracle style [1]. I have seen
that many engines support an INDEX hint to force an index scan on a table [2] [3];
maybe you can give that a try as well.

[1] https://calcite.apache.org/docs/reference.html#sql-hints
[2] https://docs.oracle.com/cd/B13789_01/server.101/b10752/hintsref.htm#5156
[3] 
https://docs.microsoft.com/en-us/sql/t-sql/queries/hints-transact-sql-table?view=sql-server-ver15

Best,
Danny Chan
On May 29, 2020 at 4:44 PM (+0800), Tim Fox wrote:
> Hi,
>
> I'm building a query engine with Calcite - really enjoying working with
> Calcite so far!
>
> When creating a plan, it seems Calcite always creates a plan where the
> sources are table scans, however in my implementation the tables can have
> indexes on them so a table scan is not always the right choice.
>
> I was wondering if there was any way of making Calcite "index aware" - e.g.
> perhaps providing hints to the table scan instance that, actually, an index
> scan or a primary key lookup should be used instead of actually scanning
> the table. E.g. On the table meta-data if we provided information about any
> indexes on the table, then Calcite could figure out what parts of the query
> to push to the table scan and which to keep in the rest of the plan.
>
> There are two specific cases I really care about:
>
> 1. Queries that contain a primary key lookup:
>
> select * from some_table where key_column=23 AND some_other_column='foo';
>
> In the above case the 'select * from some_table where key_column=23' can be
> implemented as a simple PK lookup in the source table, not requiring a
> scan, thus leaving just the filter corresponding to
> 'some_other_column='foo'' in the rest of the plan
>
> 2. Queries with expressions on a column which has a secondary index
>
> select * from users where country='UK' and some_other_column='foo';
>
> We have many users, and let's say 10% of them are from UK (still a lot). We
> have a secondary index in the country column in the source table so we can
> do an efficient index scan to retrieve the matching records.
>
> I found this document
> https://calcite.apache.org/docs/materialized_views.html which seems like it
> might help me in some way.
>
> The idea being if I can think of my indexes as materialized views then the
> query can be written against those materialized views as sources instead of
> the original table sources. There appears to be a rule
> 'MaterializedViewRule' that does this already (?).
>
> This seems to get me a bit further, however, for this approach to work, it
> seems I would have to create materialized views _dynamically_ during
> evaluation of the query, register them, rewrite the query, execute it, then
> deregister the materialized view.
>
> E.g. for the primary key lookup example above, for the following query:
>
> select * from some_table where key_column=23 AND some_other_column='foo';
>
> I would need to dynamically create a materialized view corresponding to:
>
> select * from some_table where key_column=23
>
> Then rewrite the query using MaterializedViewRule.
>
> In the general case, in order to figure out what materialized views I need
> to dynamically create I would need to examine the query, figure out which
> columns in expressions have indexes on them and from them work out the best
> materialized view to create based on that information. This seems non
> trivial.
>
> Does anyone have any suggestions or pointers for how to implement this kind
> of thing? I suspect I'm not the first person to have tried to do this, as
> using indexes on tables seems a pretty common thing in many systems (?)
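
The two cases Tim describes come down to splitting the conjuncts of a WHERE
clause into one predicate that an index can answer and a residual filter that
stays in the plan. The Python sketch below illustrates only that split; it does
not use any Calcite API, it handles equality conjuncts only, and `plan_access`
together with its tuple-based result format is a hypothetical design chosen for
the example.

```python
def plan_access(conjuncts, primary_key, indexed_columns):
    """Pick an access path for a conjunction of `column = value` predicates.

    conjuncts is a list of (column, value) pairs. Returns a pair
    (access_path, residual): the access path is a primary-key lookup if
    possible, otherwise a secondary-index scan, otherwise a full table
    scan; residual holds the conjuncts that still need a filter.
    """
    conjuncts = list(conjuncts)
    # Prefer the primary key, then fall back to secondary indexes.
    for kind, candidates in (("pk_lookup", {primary_key}),
                             ("index_scan", set(indexed_columns))):
        for i, (column, value) in enumerate(conjuncts):
            if column in candidates:
                residual = conjuncts[:i] + conjuncts[i + 1:]
                return (kind, column, value), residual
    return ("table_scan",), conjuncts


# Case 1: where key_column=23 AND some_other_column='foo'
print(plan_access([("key_column", 23), ("some_other_column", "foo")],
                  "key_column", ["country"]))
# Case 2: where country='UK' and some_other_column='foo'
print(plan_access([("country", "UK"), ("some_other_column", "foo")],
                  "key_column", ["country"]))
```

A real planner would choose between candidate access paths by cost rather than
by a fixed preference order; the fixed order here only keeps the example short.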

Re: [VOTE] Release apache-calcite-1.23.0 (release candidate 1)

2020-05-20 Thread Danny Chan

Thanks Haisheng for driving this!

Verified GPG signature - OK
Verified SHA512 - OK
Release notes - OK
Diff the files in the release artifact and tag - ok

As for CALCITE-3997, it only affects the physical nodes, so I’m fine with that ~

So +1 (binding) ~

Gradle 6.3 (via gradlew)
Mac OS High Sierra
JDK version "1.8.0_151"

On Wed, May 20, 2020 at 9:38 AM, Forward Xu wrote:

> Thanks, Haisheng!
>
>
> Environment: Mac-OS Mojave (10.14.6)
>
> JDK version: 1.8.0_211
>
>
> Release notes: OK
>
> Build with tests: OK
>
> Checksums and signatures: OK
>
>
> Vote:
>
> +1 (non-binding)
>
>
> Best,
>
> Forward
>
> On Tue, May 19, 2020 at 10:53 AM, XING JIN wrote:
>
> > Thanks, Haisheng ~
> >
> > Local Calcite build with tests enabled on Linux: *OK*
> > Calcite-based system test suite: *OK*
> >
> > Vote:
> > +1 (non-binding)
> >
> > On Mon, May 18, 2020 at 2:00 PM, Francis Chuang wrote:
> >
> > > Thanks for making this release available for voting, Haisheng!
> > >
> > > Verified GPG signature - OK
> > > Verified SHA512 - OK
> > > Ran tests per HOWTO (./gradlew check) - OK
> > > Quickly skimmed release notes - OK
> > > Spotted checked a few JARs in the Maven repository - OK
> > >
> > > Environment (OpenJDK:latest docker container):
> > > Gradle 6.3 (via gradlew)
> > > Debian GNU/Linux 8
> > > openjdk version "1.8.0_111"
> > > OpenJDK Runtime Environment (build 1.8.0_111-8u111-b14-2~bpo8+1-b14)
> > > OpenJDK 64-Bit Server VM (build 25.111-b14, mixed mode)
> > >
> > > My vote is: +1 (binding)
> > >
> > > Francis
> > >
> > > On 16/05/2020 2:02 pm, Haisheng Yuan wrote:
> > > > Hi all,
> > > >
> > > > I have created a build for Apache Calcite 1.23.0, release
> > > > candidate 1.
> > > >
> > > > Thanks to everyone who has contributed to this release.
> > > >
> > > > You can read the release notes here:
> > > > https://github.com/apache/calcite/blob/calcite-1.23.0-rc1/site/_docs/history.md
> > > >
> > > > The commit to be voted upon:
> > > > https://gitbox.apache.org/repos/asf?p=calcite.git;a=commit;h=b708fdc46d4c5fd4c5a6c7a398823318a7b4dce3
> > > >
> > > > Its hash is b708fdc46d4c5fd4c5a6c7a398823318a7b4dce3
> > > >
> > > > Tag:
> > > > https://github.com/apache/calcite/tree/calcite-1.23.0-rc1
> > > >
> > > > The artifacts to be voted on are located here:
> > > > https://dist.apache.org/repos/dist/dev/calcite/apache-calcite-1.23.0-rc1
> > > > (revision 39622)
> > > >
> > > > The hashes of the artifacts are as follows:
> > > > 961c4f13199e199c669a6168ba655a9492bdd80d644da375a684b732c0b628b8a2ffacea5da97c82e8702a8e3bf7a1f58784baa49509fb3c48ef593259e11f46
> > > > *apache-calcite-1.23.0-src.tar.gz
> > > >
> > > > A staged Maven repository is available for review at:
> > > > https://repository.apache.org/content/repositories/orgapachecalcite-1089/org/apache/calcite/
> > > >
> > > > Release artifacts are signed with the following key:
> > > > https://dist.apache.org/repos/dist/release/calcite/KEYS
> > > >
> > > > N.B.
> > > > To create the jars and test Apache Calcite: "./gradlew build".
> > > >
> > > > If you do not have a Java environment available, you can run the tests
> > > > using docker. To do so, install docker and docker-compose, then run
> > > > "docker-compose run test" from the root of the directory.
> > > >
> > > > Please vote on releasing this package as Apache Calcite 1.23.0.
> > > >
> > > > The vote is open for the next 72 hours and passes if a majority of at
> > > > least three +1 PMC votes are cast.
> > > >
> > > > [ ] +1 Release this package as Apache Calcite 1.23.0
> > > > [ ]  0 I don't feel strongly about it, but I'm okay with the release
> > > > [ ] -1 Do not release this package because...
> > > >
> > > >
> > > > Here is my vote:
> > > >
> > > > +1 (binding)
> > > >
> > > > Thanks,
> > > > Haisheng Yuan
> > > >
> > >
> >
>
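
The "Verified SHA512 - OK" step reported by the voters can be reproduced with a
few lines of Python. This is a minimal sketch using only the standard library;
`sha512_of` and `matches_published` are hypothetical helper names, and the
sketch assumes the published line has the form "<hex digest> *<file name>" as
quoted in the vote email. It does not cover the GPG signature check.

```python
import hashlib


def sha512_of(path, chunk_size=1 << 20):
    """Compute the SHA-512 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha512()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published(path, published_line):
    """Compare a local file against a '<hex digest> *<file name>' line."""
    expected = published_line.split()[0].lower()
    return sha512_of(path) == expected
```

A typical use would be to download apache-calcite-1.23.0-src.tar.gz, paste the
hash line from the vote email, and call matches_published with both.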

Re: Calcite in research - The ReProVide Query-Sequence Optimization in a Hardware-Accelerated DBMS

2020-05-10 Thread Danny Chan

Exciting, glad to hear that, Lekshmi!

On Mon, May 11, 2020 at 3:07 AM, Lekshmi wrote:

> Hello all,
> This happened because of the support provided by this group. I'm always
> grateful to this community.
> You could also find another paper and the video which demonstrates the
> prototype we developed using Apache Calcite(accepted and published in
> EDBT2020.)
> https://openproceedings.org/2020/conf/edbt/paper_330.pdf
>
> https://openproceedings.org/2020/conf/edbt/videos/demo-session2a/4_LekshmiAndreasKlausStefanJuergen.mp4
>
> More papers are on the way...
>
> Thanks and Regards
>
> Lekshmi B.G
> Email: lekshmib...@gmail.com
>
>
>
>
> On Sun, May 10, 2020 at 4:54 AM Julian Hyde  wrote:
>
> > Calcite is being used in a research project that accelerates queries using
> > FPGAs (The ReProVide Query-Sequence Optimization in a Hardware-Accelerated
> > DBMS, by Lekshmi B. G., Andreas Becher, Klaus Meyer-Wegener,
> > https://arxiv.org/pdf/2005.01511.pdf).
> >
> > "Hardware acceleration of database query processing can be done with the
> > help of FPGAs. In particular, they are partially reconfigurable at runtime,
> > which allows for the runtime adaption of the hardware to a variety of
> > queries. This paper presents optimizations based on query sequences,
> > which reduces the impact of the reconfigurations."
> >
> > "We have implemented a ReProVide prototype, which is demonstrated at the
> > 2020 EDBT conference [5]. The host is running Apache Calcite [6] as a DBMS.
> > It has been extended with optimization rules to push down available
> > operators to RPU based on cost."
> >
> > Always good to hear about Calcite being used in research.
> >
> > Julian
> >
> >
>

Re: Re: [ANNOUNCE] New committer: Forward Xu

2020-04-30 Thread Danny Chan

Congratulations, Forward!

Best,
Danny Chan
On April 30, 2020 at 12:14 PM (+0800), Fan Liya wrote:
> Congratulations, Forward!
>
> Best,
> Liya Fan
>
> On Wed, Apr 29, 2020 at 8:51 PM Wang Yanlin <1989yanlinw...@163.com> wrote:
>
> > Congrations! Forward!--
> >
> > Best,
> > Wang Yanlin
> >
> >
> >
> >
> >
> > At 2020-04-29 10:52:25, "Feng Zhu"  wrote:
> > > Congrations! Forward!
> > >
> > > best,
> > > Feng
> > >
> > > On Wed, Apr 29, 2020 at 10:17 AM, Chunwei Lei wrote:
> > >
> > > > Congrats, Forward!
> > > >
> > > >
> > > >
> > > > Best,
> > > > Chunwei
> > > >
> > > >
> > > > On Wed, Apr 29, 2020 at 6:46 AM Rui Wang  wrote:
> > > >
> > > > > Congrats!
> > > > >
> > > > >
> > > > > -Rui
> > > > >
> > > > > > On Tue, Apr 28, 2020 at 3:04 PM Francis Chuang <francischu...@apache.org>
> > > > > > wrote:
> > > > >
> > > > > > Congrats, Forward!
> > > > > >
> > > > > > Francis
> > > > > >
> > > > > > On 29/04/2020 7:53 am, Stamatis Zampetakis wrote:
> > > > > > > > Apache Calcite's Project Management Committee (PMC) has invited
> > > > > > > > Forward Xu to become a committer, and we are pleased to announce
> > > > > > > > that he has accepted.
> > > > > > > >
> > > > > > > > Forward has been helping the project for some time now. He added
> > > > > > > > many new SQL functions to the project and is one of our JSON
> > > > > > > > experts. On top of that, and other fixes, he is the one who added
> > > > > > > > the Redis adapter to the project.
> > > > > > > >
> > > > > > > > Forward, welcome, thank you for your contributions, and we look
> > > > > > > > forward to your further interactions with the community! If you
> > > > > > > > wish, please feel free to tell us more about yourself and what
> > > > > > > > you are working on.
> > > > > > >
> > > > > > > Stamatis (on behalf of the Apache Calcite PMC)
> > > > > > >
> > > > > >
> > > > >
> > > >
> >

Re: How to give type in case of type mismatch while doing union

2020-04-29 Thread Danny Chan

Or try SQL; it has implicit type coercion ~

Best,
Danny Chan
On April 28, 2020 at 4:36 AM (+0800), Julian Hyde wrote:
> Anjali,
>
> If you’re using RelBuilder to create the union, or creating the union 
> manually, it is your responsibility to make sure that the input RelNodes have 
> compatible types.
>
> RelDataTypeFactory.leastRestrictive(List) may be useful.
>
> Julian
>
>
> > On Apr 27, 2020, at 1:32 PM, Rui Wang  wrote:
> >
> > Did a quick test by running a SQL query that has UNION on two different
> > types. Validator gave a correct error message (not NPE) to remind type
> > mismatch.
> >
> > Agreed with Jin, could you provide more context/example how you reach the
> > NPE? (It could be better if you can file a Jira with your context).
> >
> >
> > -Rui
> >
> > On Mon, Apr 27, 2020 at 4:35 AM XING JIN  wrote:
> >
> > > Hi, Anjali ~
> > > Are you doing the UNION by Sql ? If so, can you give the Sql content ?
> > > Are you doing the UNION on RelNodes ?, If so, you need to do type CAST.
> > >
> > > Jin
> > >
> > > On Mon, Apr 27, 2020 at 4:25 PM, Anjali Shrishrimal wrote:
> > >
> > > > Hi,
> > > >
> > > > While doing union of 2 RelNodes with different types, I am getting NPE.
> > > (I
> > > > am using calcite 1.21.0)
> > > > java.lang.NullPointerException: at index 0
> > > > at
> > > >
> > > com.google.common.collect.ObjectArrays.checkElementNotNull(ObjectArrays.java:225)
> > > > at
> > > >
> > > com.google.common.collect.ObjectArrays.checkElementsNotNull(ObjectArrays.java:215)
> > > > at
> > > >
> > > com.google.common.collect.ObjectArrays.checkElementsNotNull(ObjectArrays.java:209)
> > > > at
> > > > com.google.common.collect.ImmutableList.construct(ImmutableList.java:346)
> > > > at
> > > > com.google.common.collect.ImmutableList.copyOf(ImmutableList.java:258)
> > > > at
> > > >
> > > org.apache.calcite.rel.type.RelDataTypeFactoryImpl.canonize(RelDataTypeFactoryImpl.java:373)
> > > > at
> > > >
> > > org.apache.calcite.rel.type.RelDataTypeFactoryImpl.createStructType(RelDataTypeFactoryImpl.java:155)
> > > > at
> > > >
> > > org.apache.calcite.rel.type.RelDataTypeFactoryImpl.createStructType(RelDataTypeFactoryImpl.java:146)
> > > > at
> > > >
> > > org.apache.calcite.rel.type.RelDataTypeFactory$Builder.build(RelDataTypeFactory.java:569)
> > > > at
> > > >
> > > org.apache.calcite.rel.type.RelDataTypeFactoryImpl.leastRestrictiveStructuredType(RelDataTypeFactoryImpl.java:257)
> > > > at
> > > >
> > > org.apache.calcite.sql.type.SqlTypeFactoryImpl.leastRestrictiveSqlType(SqlTypeFactoryImpl.java:285)
> > > > at
> > > >
> > > org.apache.calcite.sql.type.SqlTypeFactoryImpl.leastRestrictive(SqlTypeFactoryImpl.java:156)
> > > > at
> > > > org.apache.calcite.rel.core.SetOp.deriveRowType(SetOp.java:107)
> > > >
> > > > If the column types (family types) are different, currently the derived
> > > > type is null. Is there any way to control that?
> > > > Where can I define the type in case of mismatch ?
> > > >
> > > >
> > > > Thank you,
> > > > Anjali Shrishrimal
> > > >
> > >
>
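
A note on the advice in this thread: the sketch below is plain Java with no Calcite dependency, and it only models why the caller has to reconcile column types before building a union. The `Type` enum and the `leastRestrictive` and `unionColumnType` helpers are hypothetical stand-ins, not Calcite's `RelDataTypeFactory.leastRestrictive`. They assume the behavior described above: inputs from different type families have no common type (null), so the caller must check for that and choose an explicit cast target.

```java
import java.util.Arrays;
import java.util.List;

public class UnionTypeSketch {
  /** Hypothetical, tiny type lattice: two numeric types and one character type. */
  public enum Type {
    INTEGER("NUMERIC", 1), BIGINT("NUMERIC", 2), VARCHAR("CHARACTER", 1);

    final String family;
    final int rank; // higher rank = less restrictive within the same family

    Type(String family, int rank) {
      this.family = family;
      this.rank = rank;
    }
  }

  /** Returns the widest type if all inputs share a family; null otherwise. */
  public static Type leastRestrictive(List<Type> types) {
    Type result = null;
    for (Type t : types) {
      if (result == null) {
        result = t;
      } else if (!result.family.equals(t.family)) {
        return null; // no common type across families
      } else if (t.rank > result.rank) {
        result = t;
      }
    }
    return result;
  }

  /** Picks the union column type, falling back to an explicit cast target. */
  public static Type unionColumnType(Type left, Type right, Type castTarget) {
    Type common = leastRestrictive(Arrays.asList(left, right));
    return common != null ? common : castTarget;
  }

  public static void main(String[] args) {
    // Same family: the wider type wins.
    System.out.println(unionColumnType(Type.INTEGER, Type.BIGINT, Type.VARCHAR));
    // Different families: no common type, so the caller's cast target is used.
    System.out.println(unionColumnType(Type.INTEGER, Type.VARCHAR, Type.VARCHAR));
  }
}
```

With real RelNodes, the equivalent step would be to cast each mismatching input column to the chosen type before creating the union.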

Re: [ANNOUNCE] New committer: Wang Yanlin

2020-04-29 Thread Danny Chan

Congrats, Wang Yanlin!

Best,
Danny Chan
On Apr 29, 2020 at 8:26 AM +0800, dev@calcite.apache.org wrote:
>
> Congrats, Wang Yanlin!

Re: Re: [ANNOUNCE] New committer: Jin Xing

2020-04-29 Thread Danny Chan

Congrats, Jin Xing!

Best,
Danny Chan
On Apr 30, 2020 at 12:15 PM +0800, dev@calcite.apache.org wrote:
>
> Congrats, Jin Xing!

Re: [Tests Failing] Master Travis test fails continuously for JDK14

2020-04-28 Thread Danny Chan

I have deleted the master cache, hope it helps ~

Best,
Danny Chan
On Apr 28, 2020 at 10:53 AM +0800, Michael Mior wrote:
> I have my accounts linked, but I see " Your permissions are
> insufficient to access this content."
> --
> Michael Mior
> mm...@apache.org
>
> On Mon, Apr 27, 2020 at 1:10 PM, Kevin Risden wrote:
> >
> > We should be able to clear the Travis cache by ourselves.
> >
> > If you have your github and asf accounts linked:
> >
> > https://travis-ci.org/github/apache/calcite/caches
> >
> > That page is linked under the more options in the top right of the Calcite
> > page. You should be able to clear all or some subset of caches.
> >
> > Kevin Risden
> >
> >
> > On Mon, Apr 27, 2020 at 8:53 AM Vladimir Sitnikov <
> > sitnikov.vladi...@gmail.com> wrote:
> >
> > > > Log an issue there ?
> > >
> > > An issue for INFRA project would probably do:
> > > https://issues.apache.org/jira/projects/INFRA
> > >
> > > Vladimir
> > >

Re: [Tests Failing] Master Travis test fails continuously for JDK14

2020-04-27 Thread Danny Chan

Thanks. How can I ask them to do that? By logging an issue there?

Best,
Danny Chan
On Apr 27, 2020 at 5:38 PM +0800, Vladimir Sitnikov wrote:
> I guess the solution is to ask INFRA to reset all the Travis caches for
> Calcite.
>
> Vladimir

[Tests Failing] Master Travis test fails continuously for JDK14

2020-04-27 Thread Danny Chan

Here is an example https://travis-ci.org/github/apache/calcite/jobs/679970288

Can someone help with that ? Thanks ~

Best,
Danny Chan

Re: [ANNOUNCE] New committer: Vineet Garg

2020-04-26 Thread Danny Chan

Congrats, Vineet!

Best,
Danny Chan
On Apr 26, 2020 at 1:55 PM +0800, dev@calcite.apache.org wrote:
>
> Congrats, Vineet!

Re: [DISCUSS] Towards Cascades Optimizer

2020-04-22 Thread Danny Chan

> Is there any recommended approach to make that happen smoothly besides
> coding and testing work? We need to be aware that the new planner might be
> co-exist with VolcanoPlanner for 5 or more years, or even never replace
> VolcanoPlanner.

If that is true, I would say the new planner probably does not have a good
enough design. We should find out in advance in which cases, and for what
reasons, users would keep the old VolcanoPlanner, and we *must* give a
solution for those problems in the new design.

I expect that developing a new planner would take at least 1 year. If that
is true, modifying the current planner directly means that each of the next
3~4 Calcite versions would bring in huge plan changes/bugs, which I believe
no Calcite user wants to see. And no one can guarantee that modifying
directly keeps good stability and compatibility; only the test set can.

From the experience of the Alibaba Blink planner, which was contributed to
Apache Flink: yes, the old and new planners would co-exist for at least 2
years, because the new and old planners have different abilities in some
corner cases.

From my point of view, we should at least:
- Give a convincing test set for the new planner that makes us believe the
new planner is stable and powerful enough. The current rule tests are
obviously far from enough to support the new planner.
- Give a more detailed design doc for the new planner, especially about the
interface changes and any change that would bring in compatibility
problems. Then we can estimate more accurately how much work the new
planner would bring in; only then can we decide whether switching to
developing a purely new planner is a good idea, or whether to modify the
existing one.


On Wed, Apr 22, 2020 at 9:45 AM, Haisheng Yuan wrote:

> Hi Andrii,
>
> > Obviously, from what is written here, I could guess that this would
> require me to change my physical planning rules, even if only by
> implementing a marker interface.
> You don't need to change your physical rules, it will be treated as equal
> as logical rules and be applied together with the real logical rules, no
> more logical/physical rules difference. This is also how current
> VolcanoPlanner works.
>
> > I don't want you to think that I somehow resent the changes you are
> pushing.
> Don't get me wrong. I am seriously thinking of revert these changes, since
> most people like the idea of adding new planner, why don't we make all the
> plan changes in the new planner, instead of forcing people changing test
> cases for the code changes that they might not need in VolcanoPlanner
> during upgrade.
>
> I didn't intend to replace VolcanoPlanner, thought just change the search
> strategy and add trait derivation mechanism, because most of the code in
> VolcanoPlanner can be reused. But since many agree to add new planner and
> replace VolcanoPlanner as the final goal, I won't be against most people's
> decision.
>
> Is there any recommended approach to make that happen smoothly besides
> coding and testing work? We need to be aware that the new planner might be
> co-exist with VolcanoPlanner for 5 or more years, or even never replace
> VolcanoPlanner.
>
> More thoughts are welcome.
>
> Haisheng
>
> On 2020/04/21 19:56:25, Андрей Цвелодуб  wrote:
> > Hello Haisheng,
> >
> > > To keep backward compatibility, all the un-marked rules will be treated
> > as logical rules, except rules that uses AbstractConverter as rule
> operand,
> > these rules still need to applied top-down, or random order.
> > Obviously, from what is written here, I could guess that this would
> require
> > me to change my physical planning rules, even if only by implementing a
> > marker interface. I am not saying this is a bad thing, but this is a
> thing
> > that should be communicated and planned ahead in case the VolcanoPlanner
> is
> > modified.
> >
> > > Looks like I have to revert changes in CALCITE-2970 and CALCITE-3753,
> > because they will cause another tons of plan changes.
> > I see you are still bitter due to all the discussions on this list
> lately,
> > I'm sorry. I don't want you to think that I somehow resent the changes
> you
> > are pushing, au contraire I support them and would be happy to help if I
> > can. I just want the process of these changes to be executed in the best
> > possible way.
> > As I see there are already several opinions in this thread that basically
> > align with what I am saying, so I guess I am not the crazy guy running
> > around and yelling "the end is nigh!".
> >
> > Thank you for taking these mumbled thoughts into account.
> >
> > Bestest Regards,
> > Andrii Tsvielodub
> >
> > On Tue, 21 Apr 2020 at 21:08, Haisheng Yuan  wrote:
> >
> > > Hi Andrii,
> > >
> > > > I guess changing the planner would lead to changes in tons of rules
> and
> > > even more tests.
> > > Obviously you didn't read through my email. You are not required to do
> any
> > > changes to your rule if you

Re: Calcite geode adapter query

2020-04-02 Thread Danny Chan

I believe few people in the community are familiar with Geode. Maybe you can 
ask the author of this module [1].

[1] https://github.com/tzolov

Best,
Danny Chan
On Apr 2, 2020 at 6:41 PM +0800, aashish choudhary wrote:
> Can anyone help me with this?
>
> With best regards,
> Ashish
>
> On Wed, Apr 1, 2020, 7:27 PM aashish choudhary 
> wrote:
>
> > Hi,
> >
> > I am trying to setup querying geode using calcite geode adapter. So far i
> > am successfully able to connect to geode cluster which is SSL enabled after
> > making few changes in the adapter and query data for regions which are
> > having flat hierarchy.
> >
> > However I am not able to run any query for region having Array list as
> > values and String as key. I am using Dbeaver to query following this link.
> > https://youtu.be/I_iaMgqdV3Q
> >
> > Can someone help me with this?
> >
> > With geode you can query those complex region in geode Pulse data browser.
> > You can do .entrySet or .entries to run queries like this.
> >
> > Also in Dbeaver in under Tables->Columns I can view fields as well for
> > regions having flat hierarchy but it doesn't show anything for regions
> > storing arrays.
> >
> >
> > With best regards,
> > Ashish
> >

Re: Asking help for parsing MySql Alter Table statements

2020-04-02 Thread Danny Chan

You are right that Calcite does not support ALTER TABLE yet.

ALTER TABLE is meaningful and necessary when the engine has a 
Catalog/MetadataStore that can persist the table definitions. Calcite does not 
support reading from or writing to an external metadata store; we only support 
reading a static schema (from JSON).

Here is the full syntax for DDL so far [1]

[1] 
https://github.com/apache/calcite/blob/master/server/src/main/codegen/includes/parserImpls.ftl

Best,
Danny Chan
On Mar 27, 2020 at 8:23 AM +0800, Antonio Gelameris wrote:
> Hello to everyone!
> I was trying to understand whether it's possible to parse ALTER TABLE
> statements or not (like the ones supported in MYSQL).
> After some hours googling, I have the feeling that according to
> documentation (https://calcite.apache.org/docs/reference.html#ddl-extensions)
> it's not possible (they're not mentioned), neither using
> SqlDdlParserImpl.FACTORY in calcite-server, since the internal
> representation required to describe a state change like this is too
> different from the current one, am I right?
> Thanks for your time and help!
> ---
> Antonio
>

Re: Draft board report for April 2020

2020-04-02 Thread Danny Chan

Thanks for the summary Stamatis ~

It's a pity to see the activity levels decrease. That may be because some of 
the most active committers/PMC members have been less active recently, me 
included; personally, I’m really busy these days with my Flink work. But I 
always try to check the mailing list daily and give some help when I have time 
(on the weekend).

I think things will get better and better. Calcite is a great project and we 
deserve to have more active committers in the future ~

Best,
Danny Chan
On Apr 2, 2020 at 6:08 AM +0800, Stamatis Zampetakis wrote:
> Attached below is a draft of this month's board report. I plan to submit it
> on April 6. Please let me know if you have any additions or corrections.
>
> ## Description:
> Apache Calcite is a highly customizable framework for parsing and planning
> queries on data in a wide variety of formats. It allows database-like
> access,
> and in particular a SQL interface and advanced query optimization, for data
> not residing in a traditional database.
>
> Avatica is a sub-project within Calcite and provides a framework for
> building
> local and remote JDBC and ODBC database drivers. Avatica has an independent
> release schedule and its own repository.
>
> ## Issues:
> There are no issues requiring board attention.
>
> ## Membership Data:
> Apache Calcite was founded 2015-10-22 (4 years ago)
> There are currently 46 committers and 22 PMC members in this project.
> The Committer-to-PMC ratio is roughly 2:1.
>
> Community changes, past quarter:
> - No new PMC members. Last addition was Haisheng Yuan on 2019-11-11.
> - Feng Zhu was added as committer on 2020-02-29
>
> ## Project Activity:
> Avatica 1.16.0 was released in the middle of December, including numerous
> bug fixes and security improvements while the build system has been
> migrated from Maven to gradle.
>
> Calcite 1.22.0 was released at the beginning of March, including more than
> 250 resolved issues including long-awaited features such as support of SQL
> hints, important bug fixes in the core of the query planner, and many
> changes in build and test infrastructure. This release was the biggest in
> the history of the project and one of the most complicated to pull out due
> to the big number of contributions.
>
> An introductory talk about query planning and Calcite was given at the
> university of Cergy-Pontoise, Paris, France, in March. The audience was not
> aware of the existence of the project which shows that we should organize
> such events more often, at least in
> Europe, if we want to expand our community. [Do we have something else to
> add here?]
>
> Finally, we were informed that the SuperSQL project by Tencent uses Calcite
> in order to provide a unified entrance to various data platforms such as
> RDBMS, ElasticSearch, Hive, Flink, Spark, Presto, ClickHouse, etc.
>
> ## Community Health:
>
> Activity levels on mailing lists (-8%), git (-24%) and JIRA (opened -20%,
> closed
> -30%) have decreased in the first quarter of 2020. Most of the big changes
> that started during the last quarter of 2019 (and initiated a lot of
> activity) have
> been finalized and incorporated in the release of Calcite 1.22.0, making
> things
> calmer in the community. The big decrease of closed issues in JIRA may be
> explained by the frequency of releases that have slightly dropped. To the
> above it is worth adding the period of holidays in December and January
> that in general slows things down.
>
> Following the decreased activity in the lists, git, and JIRA it is normal
> to see
> the activity rates on pull requests drop (-20% opened, -31% closed) more or
> less
> for the same reasons. Although our backlog keeps increasing (~174 open
> pull requests), the community is very reactive commenting and reviewing
> almost every contribution.

Re: Parsing Teradata SQL with Apache Calcite

2020-03-29 Thread Danny Chan

If you only want to add some extension operators for TD, modifying 
SqlLibraryOperators.java is okay.

If you also want to interact with TD through a JDBC connection, a TD 
SqlDialect may also need to be introduced.

Best,
Danny Chan
On Mar 28, 2020 at 5:32 PM +0800, David Valenzuela Rodriguez wrote:
> Hi,
>
> In order to define the TD library, it looks like I'd basically need to
> modify the following:
>
> -
> https://github.com/apache/calcite/blob/888dd3a7d20ad04b22434f16d96177b7ca2c28aa/core/src/main/java/org/apache/calcite/sql/fun/SqlLibraryOperators.java
> -
> https://github.com/apache/calcite/blob/52a57078ba081b24b9d086ed363c715485d1a519/core/src/main/java/org/apache/calcite/sql/fun/SqlLibrary.java
>
>
> Is that accurate?
>
> Thanks!
> David
>
> On Thu, Mar 26, 2020 at 5:13 PM David Valenzuela Rodriguez <
> davi...@groupon.com> wrote:
>
> > Awesome, thanks!
> >
> > David
> >
> > On Thu, Mar 26, 2020 at 4:02 PM Julian Hyde  wrote:
> >
> > > There are a few related concepts that make up the general idea of
> > > "dialect":
> > > * *dialect* determines how the JDBC adapter convert SqlNodes to SQL
> > > (e.g. generate "LIMIT 5" or "FETCH 5 ROWS")
> > > * *conformance* determines what features the validator will allow
> > > (e.g. does "GROUP BY 1" mean group by the first field or the literal
> > > 1)
> > > * *parserConfig* parameters control the parser, for example
> > > identifiers (quoted using brackets, or back-ticks or double-quotes),
> > > and how it treats the case of quoted and unquoted identifiers
> > > * *library* is a set of built-in functions. For example, you might
> > > want to use standard SQL functions plus Oracle's function set (e.g.
> > > DECODE)
> > > * *type system* determines things such as what is the maximum length
> > > of a VARCHAR value
> > >
> > > These are intentionally kept separate in the code. We built a bridge
> > > between dialect parserConfig in
> > > https://issues.apache.org/jira/browse/CALCITE-3050. It's worth reading
> > > that issue for the back-story.
> > >
> > > To parse Teradata SQL, you probably need to create a library (for
> > > functions) and a conformance. Maybe a new constant Lex.TERADATA.
> > >
> > > Julian
> > >
> > >
> > > If you want to parse Tera
> > >
> > > On Thu, Mar 26, 2020 at 3:44 PM David Valenzuela Rodriguez
> > >  wrote:
> > > >
> > > > Hi!
> > > >
> > > > I'm currently evaluating the Apache Calcite framework for the purposes
> > > of
> > > > parsing Teradata SQL. As far as I can tell, this is accomplished by
> > > > defining a SqlOperatorTable with TD-specific operators. Other than that,
> > > > I'm not really sure how to integrate that SqlOperatorTable into the
> > > > Parser/Planner. Are there any guidelines or examples of this? I haven't
> > > > really seen this, but I assume this is not something new.
> > > >
> > > > Thanks in advance!
> > > >
> > > > --
> > > > David Valenzuela
> > > > Software Engineer | Data & Discovery | GROUPON
> > > > mobile 425.614.5243
> > > > davi...@groupon.com
> > >
> >
> >
> > --
> > David Valenzuela
> > Software Engineer | Data & Discovery | GROUPON
> > mobile 425.614.5243
> > davi...@groupon.com
> >
>
>
> --
> David Valenzuela
> Software Engineer | Data & Discovery | GROUPON
> mobile 425.614.5243
> davi...@groupon.com

Re: STREAM keyword

2020-03-24 Thread Danny Chan

In Apache Flink, we have the syntax:

… A JOIN B FOR SYSTEM_TIME AS OF A.PROC_TIME

It describes a stream A joining a temporal table B, where we only want to join 
against the records of table B as of the current machine time.

Is that the case Viliam described?


Best,
Danny Chan
On Mar 25, 2020 at 12:46 AM +0800, Julian Hyde wrote:
> You’re right that this is a problem.
>
> We’d need some way to say that you don’t care which version of the product 
> table you are joining against. One implication would be that if you replay 
> the query, and the product table has changed in the mean time, you are happy 
> to get different results.
>
> We could devise some syntax to add to the SQL. And/or we could add some 
> annotation to the product TVR. What do you think?
>
> Julian
>
>
> > On Mar 24, 2020, at 12:11 AM, Viliam Durina  wrote:
> >
> > So how would you do a simple stream enrichment query? That is one that for
> > each new record in an append-only relation will join a matching record from
> > a mutable relation that's valid at the processing time? This use case is
> > common, for example in credit card fraud detection, for each transaction
> > you look up the cardholder statistics, merchant statistics, product
> > statistics, transaction history etc. that you have at hand at the moment
> > the transaction is processed and the enriched record is then fed to a
> > rule-based engine or to an ML inference model. You're not interested in
> > later updates in those enrichment tables. In my understanding it is not
> > possible with the proposed semantics.
> >
> > For example, can you refer to the `undo`, `ptime` and `ver` columns in the
> > query itself? We could filter out columns where `ver > 0`:
> >
> > SELECT (
> > SELECT *
> > FROM order_item o
> > JOIN product p USING(product_id)
> > EMIT STREAM
> > ) WHERE ver = 0;
> >
> > You can optimize for the common events, and not use very much memory. For
> > > the rarer events, you can pay the cost of a disk I/O.
> > >
> >
> > With the particular query I don't think you can do this. Let's say the
> > `order_item` is backed by a Kafka topic - you might not have the full
> > history. And even if you do, the receiver of the query results is not
> > interested in retractions and new versions of all the zillions of orders
> > with updated product name. The desired output should be specified by the
> > query itself. And, for example, cardholder statistics could be updated with
> > each transaction in a feedback loop.
> >
> > Viliam
> >
>
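
To make the enrichment use case in this thread concrete, here is a minimal sketch in plain Java, not Calcite or Flink code. The class and method names (`EnrichmentJoinSketch`, `upsertProduct`, `onOrderItem`) are hypothetical. It assumes the semantics Viliam asks for: each record of the append-only relation is joined against the mutable relation as it stands at processing time, and later updates to that relation do not retract or re-emit earlier results.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class EnrichmentJoinSketch {
  // Mutable "product" relation: product id -> current product name.
  private final Map<Integer, String> products = new HashMap<>();
  // Append-only output; emitted rows are never retracted or updated.
  private final List<String> output = new ArrayList<>();

  /** Inserts or updates a product; does not touch rows already emitted. */
  public void upsertProduct(int productId, String name) {
    products.put(productId, name);
  }

  /** Joins one order item against the product table as of "now". */
  public void onOrderItem(int orderId, int productId) {
    String name = products.get(productId);
    if (name != null) { // inner join: skip orders with no matching product
      output.add(orderId + ":" + name);
    }
  }

  public List<String> output() {
    return output;
  }

  public static void main(String[] args) {
    EnrichmentJoinSketch join = new EnrichmentJoinSketch();
    join.upsertProduct(7, "pen");
    join.onOrderItem(1, 7);          // enriched with the name valid right now
    join.upsertProduct(7, "pencil"); // later update: order 1 is not re-emitted
    join.onOrderItem(2, 7);
    System.out.println(join.output());
  }
}
```

The trade-off discussed above shows up directly: replaying the same orders after the product table has changed would produce different output.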

Re: [DISCUSS] get RexExecutor from RexSimplify in method reduceExpressionsInternal

2020-03-19 Thread Danny Chan

This is a matter of preference; I would prefer a default that does not throw 
exceptions.

Best,
Danny Chan
On Mar 18, 2020 at 3:53 PM +0800, Stamatis Zampetakis wrote:
> If a Janino exception comes up then it is a bug that we have to fix since
> it violates the contract of the interface.
>
> From my point of view the modification is meaningful for two reasons:
> * improves code readability;
> * avoids confusing behavior where the rules for performing
> constant reduction are present but this does not really happen (because
> there is an executor missing).
>
> I would say that in production, if the engine does not want to perform
> constant reduction, it is equally easy to not register the respective rules.
>
> Best,
> Stamatis
>
> On Wed, Mar 18, 2020 at 3:29 AM Danny Chan  wrote:
>
> > I’m a little worried about it the default RexExecutorImpl can handle all
> > the downstream projects expressions, and very probably not, there would be
> > some Janino compile exception if it can not translate the RexNodes
> > correctly.
> >
> > So strictly to say, change the RexExecutor to a default implementation may
> > break something. I think it’s better if we have a real case to illustrate
> > that the modification is meaningful.
> >
> > In production, if an engine really wants to support constant reduction for
> > their all kinds of expression, they should set up the RexExecutor
> > explicitly. If they do not set up that, the constant reduction just not
> > happens, it is better than supplying a default RexExecutor but does not
> > really work for all expression.
> >
> > So I’m +0 for this.
> >
> > Best,
> > Danny Chan
> > On Mar 17, 2020 at 4:16 PM +0800, JiaTao Tao wrote:
> > > Hi Danny
> > >
> > > Thanks for your reply, I think Stamatis Zampetakis's opinion is
> > summative,
> > > and here the problem I think is a default RexExecutor is better than
> > null,
> > > especially, in this case, cuz `reduceExpressionsInternal` and
> > > `reduceExpressions` is in the same path, thought the use of RexExecutor
> > may
> > > be different, but it still makes people confusing.
> > >
> > > IMHO, if "return RexUtil.EXECUTOR" >= "return null", we can do the
> > modify.
> > >
> > > If you think so, I can open a JIRA and do this minor change.
> > >
> > > Hope to hear your voice.
> > >
> > > Regards!
> > >
> > > Aron Tao
> > >
> > >
> > > On Tue, Mar 17, 2020 at 4:02 PM, JiaTao Tao wrote:
> > >
> > > > Hi Stamatis Zampetakis
> > > >
> > > > I agree with this completely: "The API of RexExecutor says the
> > following
> > > > "If an expression cannot be
> > > > reduced, writes the original expression..." so we don't break anything
> > by
> > > > providing a default one."
> > > >
> > > >
> > > >
> > > > Regards!
> > > >
> > > > Aron Tao
> > > >
> > > >
> > > > On Mon, Mar 16, 2020 at 9:52 PM, Stamatis Zampetakis wrote:
> > > >
> > > > > Interestingly, I was looking at this same piece of code not so long
> > ago
> > > > > and
> > > > > I agree it is a bit confusing.
> > > > >
> > > > > Looking around the places that we obtain a RexExecutor, most often
> > > > > (always?) we observe the following pattern:
> > > > >
> > > > > RexExecutor executor =
> > > > > Util.first(query.getCluster().getPlanner().getExecutor(),
> > > > > RexUtil.EXECUTOR);
> > > > >
> > > > > I think it is always useful to have an executor in the planner thus
> > I am
> > > > > tempted to change the API of RelOptPlanner#getExecutor to always
> > return an
> > > > > (default) executor if an explicit one is not set.
> > > > >
> > > > > The API of RexExecutor says the following "If an expression cannot be
> > > > > reduced, writes the original expression..." so we don't break
> > anything by
> > > > > providing a default one.
> > > > >
> > > > > What do you think?
> > > > >
> > > > > Best,
> > > > > Stamatis
> > > > >
> > > > > On Mon, Mar 16, 2020 at 11:11 AM Danny Chan 
> > wrote:
> > > > >
> > > > > > Thanks, the code is a little mess, here

Re: [DISCUSS] get RexExecutor from RexSimplify in method reduceExpressionsInternal

2020-03-17 Thread Danny Chan

I’m a little worried about whether the default RexExecutorImpl can handle all 
the expressions of downstream projects. Very probably it cannot: there would 
be some Janino compile exception if it cannot translate the RexNodes correctly.

So strictly speaking, changing the RexExecutor to a default implementation may 
break something. I think it’s better if we have a real case to illustrate that 
the modification is meaningful.

In production, if an engine really wants to support constant reduction for all 
of its kinds of expressions, it should set up the RexExecutor explicitly. If 
it does not, constant reduction simply does not happen, which is better than 
supplying a default RexExecutor that does not really work for all expressions.

So I’m +0 for this.

Best,
Danny Chan
On Mar 17, 2020 at 4:16 PM +0800, JiaTao Tao wrote:
> Hi Danny
>
> Thanks for your reply, I think Stamatis Zampetakis's opinion is summative,
> and here the problem I think is a default RexExecutor is better than null,
> especially, in this case, cuz `reduceExpressionsInternal` and
> `reduceExpressions` is in the same path, thought the use of RexExecutor may
> be different, but it still makes people confusing.
>
> IMHO, if "return RexUtil.EXECUTOR" >= "return null", we can do the modify.
>
> If you think so, I can open a JIRA and do this minor change.
>
> Hope to hear your voice.
>
> Regards!
>
> Aron Tao
>
>
> On Tue, Mar 17, 2020 at 4:02 PM, JiaTao Tao wrote:
>
> > Hi Stamatis Zampetakis
> >
> > I agree with this completely: "The API of RexExecutor says the following
> > "If an expression cannot be
> > reduced, writes the original expression..." so we don't break anything by
> > providing a default one."
> >
> >
> >
> > Regards!
> >
> > Aron Tao
> >
> >
> > On Mon, Mar 16, 2020 at 9:52 PM, Stamatis Zampetakis wrote:
> >
> > > Interestingly, I was looking at this same piece of code not so long ago
> > > and
> > > I agree it is a bit confusing.
> > >
> > > Looking around the places that we obtain a RexExecutor, most often
> > > (always?) we observe the following pattern:
> > >
> > > RexExecutor executor =
> > > Util.first(query.getCluster().getPlanner().getExecutor(),
> > > RexUtil.EXECUTOR);
> > >
> > > I think it is always useful to have an executor in the planner thus I am
> > > tempted to change the API of RelOptPlanner#getExecutor to always return an
> > > (default) executor if an explicit one is not set.
> > >
> > > The API of RexExecutor says the following "If an expression cannot be
> > > reduced, writes the original expression..." so we don't break anything by
> > > providing a default one.
> > >
> > > What do you think?
> > >
> > > Best,
> > > Stamatis
> > >
> > > On Mon, Mar 16, 2020 at 11:11 AM Danny Chan  wrote:
> > >
> > > > Thanks, the code is a little mess, here is how I understand it:
> > > >
> > > > The executor from `final RexExecutor executor
> > > > = Util.first(cluster.getPlanner().getExecutor(), RexUtil.EXECUTOR)` is
> > > > mainly used to construct the RexSimplify, in the RexSimplify, the
> > > > expression that we evaluate is what we can make sure RexUtil.EXECUTOR
> > > can
> > > > resolve(if you check the code, it only reduce the literals).
> > > >
> > > > But the expressions in the ReduceExpressionsRule may be more complex,
> > > > somehow we must relay on the engine to plugin their RexExecutor to make
> > > a
> > > > constant reduction(some engine use code generation, some use Java
> > > > reflection).
> > > >
> > > > So, in total, the executor in RexSimplify has a fallback is because it’s
> > > > expression to reduce is simple enough.
> > > >
> > > > Best,
> > > > Danny Chan
> > > > On Mar 16, 2020 at 3:57 PM +0800, JiaTao Tao wrote:
> > > > > In method reduceExpressionsInternal, we get RexExecutor from cluster,
> > > it
> > > > can be null:
> > > > > <>
> > > > >
> > > > > But in the outside(reduceExpressions), `final RexExecutor executor =
> > > > Util.first(cluster.getPlanner().getExecutor(), RexUtil.EXECUTOR)`, it
> > > can't
> > > > be null.
> > > > >
> > > > > And reduceExpressions is the only caller of reduceExpressionsInternal,
> > > > so I think this is an inconsistent behavior.
> > > > >
> > > > > IMHO, we should create RexUtil.EXECUTOR if it is null
> > > > in reduceExpressionsInternal, or just get RexExecutor from RexSimplify.
> > > > > <>
> > > > >
> > > > > Regards!
> > > > > Aron Tao
> > > >
> > >
> >
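
The fallback pattern debated in this thread can be sketched in plain Java with no Calcite dependency. The `Executor` interface, the `DEFAULT` instance, and `reduceWith` are hypothetical stand-ins, and `first` only mimics the shape of the `Util.first(..., RexUtil.EXECUTOR)` call quoted above. The default here is assumed to follow the contract cited from the RexExecutor API: when it cannot reduce an expression, it returns the original unchanged.

```java
public class ExecutorFallbackSketch {
  /** Hypothetical stand-in for an executor that reduces an expression string. */
  public interface Executor {
    String reduce(String expr);
  }

  /** Default executor: reduces nothing, so it returns the input unchanged. */
  public static final Executor DEFAULT = expr -> expr;

  /** Returns the first argument if it is non-null, otherwise the fallback. */
  public static <T> T first(T value, T fallback) {
    return value != null ? value : fallback;
  }

  /** Reduces with the configured executor, or with the default if none is set. */
  public static String reduceWith(Executor configured, String expr) {
    return first(configured, DEFAULT).reduce(expr);
  }

  public static void main(String[] args) {
    // No executor configured: the expression comes back as written.
    System.out.println(reduceWith(null, "1 + 2"));
    // Engine-specific executor plugged in: the expression is actually reduced.
    System.out.println(reduceWith(expr -> "3", "1 + 2"));
  }
}
```

This mirrors both positions in the thread: a default keeps callers free of null checks, while real constant reduction still depends on the engine plugging in its own executor.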
