Re: New rule for Converting UNION ALL with same inputs but different filters to single input with OR FILTER

2022-01-22 Thread Albert
On the more general fusion case, you could probably refer to the OSDI paper
"resin general sub query fusion", which has chapters on general union
fusion that deal with the case Julian pointed out.

By the way, as I understand it, cases of union fusion are not that common in
practice. I'm curious to know the real cases.
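
For what it's worth, the overlap case Julian pointed out (quoted further
down in this thread) is easy to reproduce outside Calcite. The following is
only a minimal sketch, assuming Python's built-in sqlite3 module; the emp
table contents and the helper names (run, set_op, single_filter) are made up
for illustration and are not part of the proposed rule.

```python
import sqlite3

# Hypothetical data: empno is unique and deptno is NOT NULL, so neither
# duplicate input rows nor three-valued logic affect the comparison.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE emp (empno INTEGER PRIMARY KEY, deptno INTEGER NOT NULL)")
conn.executemany("INSERT INTO emp VALUES (?, ?)", [(1, 10), (2, 25), (3, 35), (4, 50)])

LEFT = "deptno < 30"
RIGHT = "deptno IN (25, 35, 45)"


def run(sql):
    """Run a query and return its rows as a sorted list (a multiset view)."""
    return sorted(conn.execute(sql).fetchall())


def set_op(op):
    """Shape the rule would match: the same input under two different filters."""
    return run(f"SELECT * FROM emp WHERE {LEFT} {op} SELECT * FROM emp WHERE {RIGHT}")


def single_filter(condition):
    """Shape the rule would produce: one input with one combined filter."""
    return run(f"SELECT * FROM emp WHERE {condition}")


union_all = set_op("UNION ALL")
or_filter = single_filter(f"({LEFT}) OR ({RIGHT})")
union_distinct = set_op("UNION")
intersect = set_op("INTERSECT")
and_filter = single_filter(f"({LEFT}) AND ({RIGHT})")
except_ = set_op("EXCEPT")
and_not = single_filter(f"({LEFT}) AND NOT ({RIGHT})")

print("UNION ALL:", union_all)
print("OR filter:", or_filter)
print("UNION == OR filter:", union_distinct == or_filter)
print("INTERSECT == AND filter:", intersect == and_filter)
print("EXCEPT == AND NOT filter:", except_ == and_not)
```

The row with deptno = 25 satisfies both filters, so UNION ALL returns it
twice while the OR filter returns it once. The other three rewrites agree
here only because every row is unique and deptno is never NULL; with
duplicate rows or NULLs they would need more care.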

On Friday, January 21, 2022, Yanjing Wang wrote:

> Hi Julian, I have logged a Jira case, CALCITE-4990, for this new rule;
> I will submit a PR for it later.
>
> On Fri, Jan 21, 2022 at 01:37, Julian Hyde wrote:
>
> > Process logic sounds great. Please log a Jira case.
> >
> > There’s a lot here, so it’s probably wise to split into a simple PR that
> > just does the simple case (UNION ALL, no Project) and extend later.
> >
> > > On Jan 20, 2022, at 7:07 AM, Yanjing Wang 
> > wrote:
> > >
> > > Thanks Julian and Justin.
> > >
> > > What do you think the rule should be called? UnionFilterTransposeRule,
> > >> perhaps?
> > >>
> > > SetOpFilterMergeRule?
> > >
> > > Maybe that problem does not occur when applied to UNION than when
> applied
> > >> to UNION ALL.
> > >
> > > Yes, This is very important.
> > >
> > > There would seem to be analogous rules for INTERSECT (combine the
> > >> conditions using AND) and EXCEPT (combine the conditions using AND
> NOT).
> > >>
> > > > Excellent extensions. The processing logic for all three operators is:
> > >
> > > For UNION:
> > > New Filter = left Filter *OR* right Filter.
> > >
> > > For INTERSECT:
> > > New Filter = left Filter *AND* right Filter.
> > >
> > > For EXCEPT:
> > > > If left Filter and right Filter have no overlap, transform to the
> > > > left child tree.
> > > > Otherwise:
> > > > New Filter = left Filter *AND NOT* right Filter.
> > >
> > > For UNION ALL:
> > > Add prerequisites:
> > > left Filter, right Filter must have no overlap.
> > >
> > > > For INTERSECT ALL:
> > > > Add prerequisites:
> > > > If left Filter and right Filter have no overlap, transform to empty
> > > > values.
> > > > Otherwise
> > > > the rule can't be applied.
> > > >
> > > > For EXCEPT ALL:
> > > > Same as for EXCEPT.
> > >
> > >
> > > work for N-way Union, not just 2-way Union.
> > >>
> > > Yes, I will add tests for this.
> > >
> > > And I think you should make it work whether or not a Project is
> present.
> > >
> > > > OK, it seems I need to construct several different operand match
> > > > trees for the plan.
> > >
> > > Hi Julian, what do you think of the above process logic?
> > >
> > > > On Thu, Jan 20, 2022 at 10:18, Julian Hyde <jhyde.apa...@gmail.com> wrote:
> > >
> > >> Justin,
> > >>
> > >> For planning table or index scans, I would recommend using a single
> > >> TableScan with a Filter that uses a Sarg, rather than using multiple
> > >> TableScans connected by a Union. So I think this rule will be useful.
> > >>
> > > >> But I do agree that this proposed rule is not a “no brainer”. It may
> > > >> not do what people want/expect in all cases, and therefore it probably
> > > >> should not be enabled by default.
> > >>
> > >> Julian
> > >>
> > >>
> > >>
> > >>
> > >>
> > >>> On Jan 19, 2022, at 3:38 PM, Justin Swanhart 
> > >> wrote:
> > >>>
> > >>> Hi,
> > >>>
> > >>> Note that this will negate the optimization that one usually is
> looking
> > >> for
> > >>> when writing such queries:
> > >>>
> > >>> Select * from TAB where a = 1
> > >>> UNION ALL
> > >>> Select * from TAB where b = 1
> > >>>
> > >>> In a database with indexes (most databases) this will allow indexes
> to
> > be
> > >>> used on both the a column and the b column.
> > >>> Databases with bitmap indexes or without indexes would benefit from
> the
> > >>> rule.
> > >>>
> > > >>> On Wed, Jan 19, 2022 at 4:32 PM Julian Hyde wrote:
> > >>>
> >  Can you log a Jira case for this?
> > 
> >  I think you should make your rule work for N-way Union, not just
> 2-way
> >  Union. And I think you should make it work whether or not a Project
> is
> >  present.
> > 
> > > On Jan 19, 2022, at 1:25 PM, Julian Hyde  > >
> > >> wrote:
> > >
> > > It sounds useful.
> > >
> > > What do you think the rule should be called?
> > UnionFilterTransposeRule,
> >  perhaps?
> > >
> > > A challenge when writing the rule will be to ensure that all of the
> >  inputs to the Union are the same. The Volcano framework is not very
> > >> good at
> >  that.
> > >
> > > You should be careful of the case that the conditions overlap. For
> >  example, the rewrite
> > >
> > > SELECT * FROM Emp WHERE deptno < 30
> > > UNION ALL
> > > SELECT * FROM Emp WHERE deptno IN (25, 35, 45)
> > >
> > > to
> > >
> > > SELECT * FROM Emp WHERE deptno < 30 OR deptno IN (25, 35, 45)
> > >
> > > Is not valid, because rows with deptno = 25 will appear twice in
> the
> >  first query, once in the second. Maybe that problem does not occur
> > when
> >  applied to UNION than when 

Re: [DISCUSS] Trolls and community

2021-12-22 Thread Albert
I echo Jacques’ nice and decent comments on this.

"""
- I think Julian and Vladimir are far and above the most prolific
contributors on the project (by any measure I can use: code, commits, email
responses, longevity, consistency, releases, etc). [3]
- I think Julian and Vladimir typically have the strongest opinions. (This
is pretty standard, the most prolific contributors typically have the
strongest opinions.)
- I think we have a general challenge around power asymmetry (people with
power aren't always aware of how this power influences others).
- I think Julian and Vladimir are the two project titans. People frequently
defer to them. When they go "mano a mano", it is challenging as there is
less deference and more conflict. It also can devolve.
"""

I was scared that either Vladimir or Julian would quit. Both gave me a lot
of help when I contributed.

To be honest, I did not expect that Julian was having such a hard time with
the friction with Vladimir. I always treated it as an example of different
opinions coexisting within a community, and sometimes I even use it as an
illustration.
Obviously I didn't grasp the sentiments underneath.

But there should be nothing personal after all. Let's not forget the first
goal: "community over code".

It's probably worth going through the friction items, since Vladimir feels
*"Apparently, the community does not want me or my ideas."*

- Maven to Gradle
+1, I'm not familiar with Kotlin or Gradle, but it works and things seem
better organized than before.
- fuzzing tests
+1, I would probably have problems maintaining them, but I feel this is
the correct thing to do.
- checker framework
+1, also new stuff for me, but I hate nulls.
- migrating off JIRA
+0, I kind of understand Vladimir's eagerness to improve
something, but I also feel JIRA works fine.
- auto-updating release notes
+1, no reason to object to automating a routine for everyone because of
human mistakes.
That said, we don't have one at hand, so following the procedure
this time is a workaround.

Thoughts from different perspectives are good, yet they are challenging to
accept.
We should focus more on the facts of the issues, but we usually get drawn
into the sentiments underneath the wording.

All in all, the community can't afford to lose either of you two, Julian and
Vladimir.
Have great holidays.



On Thu, Dec 23, 2021 at 4:29 AM Jacques Nadeau  wrote:

> This is a hard message to respond to. Hopefully my comments will reduce the
> friction for others to share their own observations. Something that I will
> reiterate here is that Julian posted this on a public forum which is a
> request to not only get feedback from committers and PMC members but the
> community at large. To help facilitate this, I would suggest that more
> active project members only make a single response to this to allow "space"
> for others to provide their feedback.
>
> Firstly, at least for me, it is a hard time of year to have this kind of
> nuanced & sensitive conversation since I'm doing a lot of holiday family
> stuff in the coming weeks.
>
> Secondly, having recently re-engaged in this project, I've written several
> emails in a similar vein to Julian's and then each time decided not to send
> it. If I felt like saying this multiple times and Julian sat down and
> actually sent it, I would guess that others had similar questions/concerns.
>
> ## General observations:
> - I think Julian and Vladimir are far and above the most prolific
> contributors on the project (by any measure I can use: code, commits, email
> responses, longevity, consistency, releases, etc). [3]
> - I think Julian and Vladimir typically have the strongest opinions. (This
> is pretty standard, the most prolific contributors typically have the
> strongest opinions.)
> - I think we have a general challenge around power asymmetry (people with
> power aren't always aware of how this power influences others).
> - I think Julian and Vladimir are the two project titans. People frequently
> defer to them. When they go "mano a mano", it is challenging as there is
> less deference and more conflict. It also can devolve.
>
> ## Challenging Behaviors
> I see a few specific patterns occur frequently that I think are challenging
> and we should strive to correct. These include:
>
> 1. Presenting traditions, personal preferences or opinions as "rules". [2]
> 2. Being a dog with a bone (responding quickly and repeatedly to everyone
> who has a differing opinion).
> 3. General disrespectfulness (dismissive of others' perspective and/or
> opinions, condescending behavior, unilateral action).
>
> I think all three of these behaviors are net-negative for the community.
>
> ## Goals
> I think a few things would help mitigate these challenges:
>
> A. For people in power, remember that your opinions are outsized. Apply
> them only where you think it is absolutely critical to the overall health
> of the community. You may have strong opinions about simple things. Do your
> best 

Re: [ANNOUNCE] New committer: Vladimir Ozerov

2021-06-24 Thread Albert
Congrats.
Just found the blog: https://www.querifylabs.com/author/vladimir-ozerov

On Thu, Jun 24, 2021 at 2:27 PM Alessandro Solimando <
alessandro.solima...@gmail.com> wrote:

> Congratulations Vladimir, well deserved, I had the chance to read some
> of the blog posts and I have appreciated them very much.
>
> Best regards,
> Alessandro
>
> On Thu, 24 Jun 2021 at 07:58, Viliam Durina  wrote:
> >
> > Congratulations!
> >
> > Viliam
> >
> > On Thu, 24 Jun 2021 at 06:58, Forward Xu  wrote:
> >
> > > Congratulations!
> > >
> > >
> > > Best,
> > >
> > > Forward
> > >
> > > > On Thu, Jun 24, 2021 at 11:51 AM, Danny Chan wrote:
> > >
> > > > Congrats, Vladimir!
> > > >
> > > > Best,
> > > > Danny Chan
> > > >
> > > > > On Thu, Jun 24, 2021 at 11:41 AM, Yanjing Wang wrote:
> > > >
> > > > > Congrats, Vladimir!
> > > > >
> > > > > > On Thu, Jun 24, 2021 at 11:22 AM, Roman Kondakov wrote:
> > > > >
> > > > > > Congratulations, Vladimir!
> > > > > >
> > > > > > Roman Kondakov
> > > > > >
> > > > > > On 24.06.2021 12:23, 段雄 wrote:
> > > > > > > Congratulations!
> > > > > > >
> > > > > > > > On Thu, Jun 24, 2021 at 10:21 AM, XING JIN wrote:
> > > > > > >
> > > > > > >> Congratulations ~
> > > > > > >>
> > > > > > >> Best,
> > > > > > >> Jin
> > > > > > >>
> > > > > > > >> On Thu, Jun 24, 2021 at 9:50 AM, guangyuan wang wrote:
> > > > > > >>
> > > > > > >>> Congratulations!
> > > > > > >>>
> > > > > > > >>> On Thu, Jun 24, 2021 at 6:39 AM, Francis Chuang wrote:
> > > > > > >>>
> > > > > >  Congrats, Vladimir!
> > > > > > 
> > > > > >  Francis
> > > > > > 
> > > > > >  On 24/06/2021 7:48 am, Haisheng Yuan wrote:
> > > > > > > Congratulations and thanks for your contributions,
> Vladimir!
> > > > > > >
> > > > > > > Regards,
> > > > > > > Haisheng
> > > > > > >
> > > > > > > > On 2021/06/23 21:34:40, Stamatis Zampetakis <zabe...@gmail.com> wrote:
> > > > > > > >> Apache Calcite's Project Management Committee (PMC) has invited
> > > > > > > >> Vladimir Ozerov to become a committer, and we are pleased to
> > > > > > > >> announce that he has accepted.
> > > > > > > >>
> > > > > > > >> Vladimir is among the few people who know very well the internal
> > > > > > > >> workings of the Calcite optimizer. He started and participated in
> > > > > > > >> many discussions about the core engine and contributed ideas and
> > > > > > > >> code for making it better. Moreover, Vladimir has blogged and
> > > > > > > >> talked about Calcite in various conferences and meetups giving
> > > > > > > >> publicity and showcasing the capabilities of the project.
> > > > > > > >>
> > > > > > > >> Vladimir, welcome, thank you for your contributions, and we look
> > > > > > > >> forward to your further interactions with the community! If you
> > > > > > > >> wish, please feel free to tell us more about yourself and what you
> > > > > > > >> are working on.
> > > > > > > >>
> > > > > > > >> Stamatis (on behalf of the Apache Calcite PMC)
> > > > > > > >>
> > > > > > 
> > > > > > >>>
> > > > > > >>
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> >
> > --
> > Viliam Durina
> > Jet Developer
> >   hazelcast®
> >
> >    2 W 5th Ave, Ste 300 | San Mateo, CA
> 94402 |
> > USA
> > +1 (650) 521-5453 | hazelcast.com 
> >
>


-- 
~~~
no mistakes
~~


Re: Proposal to extend Calcite into a incremental query optimizer

2021-05-14 Thread Albert
> >>>>>>>>>>> https://docs.google.com/document/d/1wyNjB94uSGwHtVvGYDwaLlCghUJE-7aDLnCdKKXJN1o/edit?usp=sharing
> >>>>>>>>>>>
> >>>>>>>>>>> Thanks,
> >>>>>>>>>>> Botong
> >>>>>>>>>>>
> >>>>>>>>>>> On Mon, Apr 5, 2021 at 3:59 PM Julian Hyde <jhyde.apa...@gmail.com> wrote:
> >>>>>>>>>>>
> >>>>>>>>>>>> I have added my time preferences to the doc.
> >>>>>>>>>>>>
> >>>>>>>>>>>> Before we meet, could you publish a PR for us to review?
> >>>>>>>>>>>>
> >>>>>>>>>>>> Initial discussions will need to be about architecture and high-level
> >>>>>>>>>>>> design. So I would ask Calcite reviewers not to review the PR
> >>>>>>>>>>>> line-by-line (or to leave comments in GitHub) but try to understand the
> >>>>>>>>>>>> design holistically, and prepare questions/comments before the meeting.
> >>>>>>>>>>>>
> >>>>>>>>>>>> Botong, can you please create a Calcite JIRA case for this task? JIRA is
> >>>>>>>>>>>> how we track long-running tasks such as this.
> >>>>>>>>>>>>
> >>>>>>>>>>>> Julian
> >>>>>>>>>>>>
> >>>>>>>>>>>>
> >>>>>>>>>>>>> On Apr 3, 2021, at 5:15 PM, Botong Huang <pku...@gmail.com> wrote:
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> Hi all,
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> Apologies for the delay. It took us some time to clean up our code base
> >>>>>>>>>>>>> and publicly release it (which will be out soon) for a quick peek.
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> We are ready to present our work. Let's schedule a time for a Zoom
> >>>>>>>>>>>>> meeting and discuss how to integrate Tempura into Calcite.
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> Since some of our team members are in China, we prefer the time slot of
> >>>>>>>>>>>>> 7:00pm-11:30pm PST any day. I've added our time preference in the shared
> >>>>>>>>>>>>> doc below.
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> https://docs.google.com/document/d/1wyNjB94uSGwHtVvGYDwaLlCghUJE-7aDLnCdKKXJN1o/edit?usp=sharing
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> We encourage everyone to add their time preferences (during
> >>>>>>>>>>>>> 04/15-04/30) in this doc. In a week or so, we will try to settle a time
> >>>>>>>>>>>>> that works for most.
> >>>>>>>>>>>

Re: Proposal to extend Calcite into a incremental query optimizer

2021-01-27 Thread Albert
It would be very nice to see a POC of your work.
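
In the meantime, the cost-vector idea Botong describes in the quoted message
below can be sketched in a few lines. This is only a toy sketch under my own
assumptions: the plan names, the numbers, and the weighted_sum / best_plan
helpers are made up for illustration and are not Tempura or Calcite APIs.

```python
from typing import Callable, Dict, Sequence

# One cost per time point, e.g. (cost at an early run, cost at the final run).
CostVector = Sequence[float]
CostFunction = Callable[[CostVector], float]


def weighted_sum(weights: Sequence[float]) -> CostFunction:
    """Build a user-supplied function that folds a cost vector into one number."""
    def final_cost(costs: CostVector) -> float:
        if len(costs) != len(weights):
            raise ValueError("cost vector and weights must have the same length")
        return sum(w * c for w, c in zip(weights, costs))
    return final_cost


def best_plan(plans: Dict[str, CostVector], final_cost: CostFunction) -> str:
    """Return the name of the plan with the lowest final cost."""
    return min(plans, key=lambda name: final_cost(plans[name]))


# Hypothetical candidate plans over two time points.
plans = {
    "recompute_at_end": (0.0, 100.0),  # do nothing early, everything at the end
    "incremental": (60.0, 30.0),       # materialize early, finish cheaply
}

early_is_cheap = weighted_sum((0.2, 1.0))      # early resources are nearly free
early_is_expensive = weighted_sum((1.0, 0.5))  # early resources are contended

print(best_plan(plans, early_is_cheap))
print(best_plan(plans, early_is_expensive))
```

Because the user function maps every cost vector to a single number, any two
candidate plans stay comparable, and changing the weights changes which plan
wins.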


On Thu, Jan 28, 2021 at 10:21 AM Botong Huang  wrote:

> Hi Julian,
>
> Just wondering if there are any updates? We are wondering if it would help
> to post our code for a quick preview.
>
> Thanks,
> Botong
>
> On Fri, Jan 1, 2021 at 11:04 AM Botong Huang  wrote:
>
> > Hi Julian,
> >
> > Thanks for your interest! Sure let's figure out a plan that best benefits
> > the community. Here are some clarifications that hopefully answer your
> > questions.
> >
> > In our work (Tempura), users specify the set of time points to consider
> > running and a cost function that expresses their preference over time;
> > Tempura then generates the best incremental plan, the one that minimizes
> > the overall cost function.
> >
> > In this incremental plan, the sub-plans at different time points can be
> > different from each other, as opposed to identical plans in all delta runs
> > as in streaming or IVM. As mentioned in Section 2.1 of the Tempura paper,
> > we can mimic the current streaming implementation by specifying two
> > (logical) time points in Tempura, representing the initial run and later
> > delta runs respectively. In general, note that Tempura supports various
> > forms of incremental computing, not only the small-delta append-only data
> > model in streaming systems. That's why we believe Tempura subsumes the
> > current streaming support, as well as any IVM implementations.
> >
> > About the cost model, we did not come up with a separate cost model, but
> > rather extended the existing one. Similar to multi-objective optimization,
> > costs incurred at different time points are considered different
> > dimensions. Tempura lets users supply a function that converts this cost
> > vector into a final cost. So under this function, any two incremental
> > plans are still comparable and there is an overall optimum. I guess we can
> > go down the route of multi-objective parametric query optimization instead
> > if there is a need.
> >
> > Next, on materialized views and multi-query optimization: since our
> > multi-time-point plan naturally involves materializing intermediate
> > results for later time points, we need to solve the problem of choosing
> > materializations and include the cost of saving and reusing the
> > materializations when costing and comparing plans. We borrowed the
> > multi-query optimization techniques to solve this problem even though we
> > are looking at a single query. As a result, we think our work is
> > orthogonal to Calcite's facilities around utilizing existing views,
> > lattice etc. We do feel that the multi-query optimization component can be
> > adopted for wider use, but it probably needs more suggestions from the
> > community.
> >
> > Lastly, our current implementation is set up in Java code; it should be
> > straightforward to hook it up with a SQL shell.
> >
> > Thanks,
> > Botong
> >
> > On Mon, Dec 28, 2020 at 6:44 PM Julian Hyde 
> > wrote:
> >
> >> Botong,
> >>
> >> This is very exciting; congratulations on this research, and thank you
> >> for contributing it back to Calcite.
> >>
> >> The research touches several areas in Calcite: streaming, materialized
> >> view maintenance, and multi-query optimization. As we already have some
> >> solutions in those areas (Sigma and Delta relational operators, lattice,
> >> and Spool operator), it will be interesting to see whether we can make
> >> them compatible, or whether one concept can subsume others.
> >>
> >> Your work differs from streaming queries in that your relations are used
> >> by “external” user queries, whereas in pure streaming queries, the only
> >> activity is the change propagation. Did you find that you needed two
> >> separate cost models - one for “view maintenance” and another for “user
> >> queries” - since the objectives of each activity are so different?
> >>
> >> I wonder whether this work will hasten the arrival of multi-objective
> >> parametric query optimization [1] in Calcite.
> >>
> >> I will make time over the next few days to read and digest your paper.
> >> Then I expect that we will have a back-and-forth process to create
> >> something that will be useful for the broader community.
> >>
> >> One thing will be particularly useful: making this functionality
> >> available from a SQL shell, so that people can experiment with this
> >> functionality without writing Java code or setting up complex databases
> >> and metadata. I have in mind something like the simple DDL operations that
> >> are available in Calcite’s ’server’ module. I wonder whether we could
> >> devise some kind of SQL syntax for a “multi-query”.
> >>
> >> Julian
> >>
> >> [1] https://cacm.acm.org/magazines/2017/10/221322-multi-objective-parametric-query-optimization/fulltext
> >>
> >>
> >>
> >> > On Dec 23, 2020, at 8:55 PM, Botong Huang  wrote:
> >> >
> >> > Thanks Aron for pointing this out. To see the figure, please refer to
> >> Fig
> >> > 3(a) in our paper:
> >> 

Re: [DISCUSS] Does anybody think this is debuging unfriendly: "call.transformTo(relBuilder.build())"

2020-11-23 Thread Albert
+1, having to `step into` makes for a bad debugging experience.
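
To make the point concrete, here is a small sketch of why the one-liner is
awkward to inspect. It is written in Python only to stay self-contained, and
StackBuilder and transform_to are hypothetical stand-ins that I made up; they
are not Calcite's RelBuilder or RelOptRuleCall. The only property borrowed
from the thread is that build() takes its result off a stack.

```python
class StackBuilder:
    """Hypothetical stand-in for a stack-based plan builder."""

    def __init__(self):
        self._stack = []

    def push(self, node):
        self._stack.append(node)
        return self

    def build(self):
        # Popping is a side effect: the node is gone from the builder after
        # this call, so build() cannot be safely re-evaluated in a debugger.
        return self._stack.pop()

    def size(self):
        return len(self._stack)


def transform_to(node, registered):
    """Hypothetical stand-in for registering the rule's result."""
    registered.append(node)


builder = StackBuilder().push("Filter(Scan(emp), deptno < 30)")
registered = []

# Two-line form: a breakpoint on the second line shows `ret` directly,
# without stepping into transform_to and without calling build() again.
ret = builder.build()
transform_to(ret, registered)

remaining = builder.size()
print(ret, remaining)
```

With the nested form, transform_to(builder.build(), registered), the built
node only becomes visible inside transform_to, and evaluating
builder.build() a second time in the debugger would fail here because the
stack is already empty.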

On Mon, Nov 23, 2020 at 6:13 PM JiaTao Tao  wrote:

> I can create a JIRA and update the code, it's minor but I think it is good
> for us.
>
>
> Regards!
>
> Aron Tao
>
>
> On Mon, Nov 23, 2020 at 5:53 PM, Haisheng Yuan wrote:
>
> > Agree with Jiatao, I had the same experience and feeling. But it mainly
> > depends on the rule creator's preference.
> >
> > On 2020/11/23 02:42:21, Danny Chan  wrote:
> > > I kind of agree, but it's more like a programming specification, we can
> > > tell people how to write codes but they may not follow those rules.
> > >
> > > On Sun, Nov 22, 2020 at 5:27 PM, JiaTao Tao wrote:
> > >
> > > > Why I don't want to debug into "transformTo":
> > > >
> > > > > 1. It's a common method: if you put a breakpoint directly in it, every
> > > > > rule will stop there; otherwise you must stop in the specific rule and
> > > > > then step into this method call, which is one more step.
> > > > > 2. There is a lot of context in the rule; if you debug into
> > > > > "transformTo", you have to go back to see it.
> > > >
> > > >
> > > > Regards!
> > > >
> > > > Aron Tao
> > > >
> > > >
> > > > > On Sun, Nov 22, 2020 at 5:23 PM, JiaTao Tao wrote:
> > > >
> > > > > > Hi
> > > > > > I've been developing Calcite full time for quite a long time, and I
> > > > > > often debug in the rules to see the transformations, but code like this
> > > > > > is not debugging-friendly in my opinion:
> > > > > > "call.transformTo(relBuilder.build())"
> > > > > >
> > > > > > To see the result of relBuilder.build(), I have to debug into the
> > > > > > "transformTo" method (you cannot evaluate "relBuilder.build()" because
> > > > > > it's a stack). If we split this into two lines, we can just stop at the
> > > > > > last line:
> > > > > >
> > > > > > RelNode ret = relBuilder.build();
> > > > > > call.transformTo(ret);
> > > > > >
> > > > > > It's not a big deal, but every time I run into this, it is a poor
> > > > > > experience. I hope to hear the community's opinion.
> > > > >
> > > > > Regards!
> > > > >
> > > > > Aron Tao
> > > > >
> > > >
> > >
> >
>




Re: [DISCUSSION] Rename master branch to main

2020-08-04 Thread Albert
I am against this change: -1 (probably in vain anyway).
@Michael Mior yes, there are people against it.

Nobody has expressed any gain from this.
Nobody required this change.
This change expresses thinking in an `incorrect` direction (personal
opinion).

On Wed, Jul 29, 2020 at 10:31 PM Michael Mior  wrote:

> I don't want to get too into the weeds here since there hasn't been
> any strong opposition and it seems like this is a change everyone
> (some perhaps reluctantly) are ok with moving forward with. But a
> couple comments:
>
> Has anyone expressed concern? No, but given that main is explicitly
> neutral, I'd rather not place the burden on people who may find
> terminology offensive to raise the issue.
> Do I think about slavery whenever I merge into master? No, but my
> ancestors were also not owned as slaves.
>
> I've opened CALCITE-4147 to track what needs to be done for this.
>
> https://issues.apache.org/jira/browse/CALCITE-4147
>
> --
> Michael Mior
> mm...@apache.org
>
>
> On Wed, Jul 29, 2020 at 05:55, Ruben Q L wrote:
> >
> > Hi all,
> >
> > I would not oppose the renaming, but I must say that I agree with Danny
> > Chan here. Is this really an issue? Is there any official guideline from
> > the ASF about this topic? Has anyone in the Calcite community truly
> > expressed any concern about the master branch being called "master"? Do
> > you really think of slavery whenever you "merge into master", or whenever
> > you use the term "master" in this context?
> >
> > I could understand renaming a "master-slave" architecture into something
> > different, since that is clearly a slavery-related terminology. But, as
> > other people have already said, not every usage of the word "master" has
> > this connotation. Honestly I see no problem in having a "master branch"
> > because, in my opinion, it is clear that when we talk about it we mean
> > the "reference branch", "principal branch" or (quoting the Merriam-Webster
> > dictionary) the "original from which copies can be made".
> >
> > Maybe I am wrong here, but I have the impression that we are fixing an
> > artificial problem that does not actually exist. If tomorrow someone on
> > Twitter says that the term "class" is offensive because it has some
> > marxist connotations, should we rewrite all our Java code? This is an
> > extreme, stupid example (I hope, although nowadays you never know), but I
> > think you know where I am going with my logic...
> >
> > We need to fight racism but IMHO this is not how to do it.
> >
> > Best,
> > Ruben
> >
> >
> > On Wed, Jul 29, 2020 at 06:54, Francis Chuang wrote:
> >
> > > I am also +1 for this change.
> > >
> > > - It's a simple change that doesn't require a lot of effort and
> > > disruption to the code base.
> > > - If we follow the links from the article Michael posted, the term
> > > "master" in git does not originate from "master record" but rather from
> > > master/slave.
> > > - We make our community more welcoming, diverse and inclusive by
> > > switching to a term that is more inclusive.
> > > - Sometimes a new word can be more self-explanatory. Recently
> > > "blacklist" and "whitelist" were replaced in the Go source code with
> > > "blocklist" and "allowlist" [1] as a case in point.
> > >
> > > Francis
> > >
> > > [1] https://go-review.googlesource.com/c/go/+/236857/
> > >
> > > On 29/07/2020 12:30 pm, Matt Burgess wrote:
> > > > Hi all,
> > > >
> > > > I'm a Calcite user and longtime mailing list lurker :) I'd like to
> > > > share our experience from Apache NiFi, we started such a discussion
> > > > for NiFi based on existing discussions from Apache Yetus and Apache
> > > > Accumulo [1]. Our own discussion continued (please see the linked
> > > > email thread) but I believe our community came to a similar consensus
> > > > as the Calcite community (and others), that whatever notions were
> > > > educed from the terms, it is more welcoming and purposeful to change
> > > > them for the best community experience. The impact to the codebase
> was
> > > > minimal and non-breaking, so we came together to perform the few
> steps
> > > > we needed to rename the default branch and search the code for terms
> > > > we could simply find-and-replace, plus we updated the Developer
> Guide.
> > > > Since then, we haven't seen much in the way of confusion or missteps
> > > > in our development process. Everyone seems to have taken the changes
> > > > in stride, updated what they needed to, and continued with their
> > > > contributions, all the while providing a better atmosphere for even
> > > > better things to come.
> > > >
> > > > Regards,
> > > > Matt
> > > >
> > > > [1]
> > >
> http://mail-archives.apache.org/mod_mbox/nifi-dev/202006.mbox/%3cCA+LyY55Mb8xZ35W_9UM=ter+gt_1azhgxmbpdn9edbssnv-...@mail.gmail.com%3e
> > > >
> > > > On Tue, Jul 28, 2020 at 9:55 PM Danny Chan wrote:
> > > >>
> > > >> As a Chinese, I didn’t understand quite well why the word “master” can
> > > >> be “slavery”. 

Re: Question of Calcite Dynamic Code Generation Feature

2020-05-14 Thread Albert
As Haisheng mentioned, it is certainly debuggable, although not as
conveniently as plain Java code.

My understanding is that the Java code generation is merely one
demonstration of how a physical plan can be implemented.
There are all kinds of ways to implement the logical plan; Java code is
merely the `helloworld` of that. You can see that many adapters do it
differently.
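
As a rough illustration of what "generate code, then run it" means, here is
a toy sketch. It is a Python analogy under my own assumptions, not Calcite's
generator: generate_filter_source and compile_predicate are made-up helpers.
The point is only that the generated source is ordinary text that can be
printed and read, and that it is compiled once and then reused for every row.

```python
def generate_filter_source(column: str, op: str, literal: int) -> str:
    """Emit the source of a predicate function for `column op literal`."""
    if op not in {"<", "<=", "=", ">=", ">"}:
        raise ValueError(f"unsupported operator: {op}")
    py_op = "==" if op == "=" else op
    return (
        "def predicate(row):\n"
        f"    # generated for: {column} {op} {literal}\n"
        f"    return row[{column!r}] {py_op} {literal!r}\n"
    )


def compile_predicate(source: str):
    """Compile the generated source once and return the resulting function."""
    namespace = {}
    exec(compile(source, "<generated>", "exec"), namespace)
    return namespace["predicate"]


source = generate_filter_source("deptno", "<", 30)
print(source)  # the generated code can be dumped and inspected

predicate = compile_predicate(source)
rows = [{"deptno": 10}, {"deptno": 35}]
kept = [row for row in rows if predicate(row)]
print(kept)
```

An adapter that pushes the whole query down to another engine would skip
this step entirely, which is the sense in which generated code is just one
of several ways to implement a plan.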

On Thu, May 14, 2020 at 2:54 AM 徐泷泽 <15258826...@qq.com> wrote:

> Hi buddies!
> Our team is developing a multi-data-source SQL engine using Calcite, but
> something confuses us: why does Calcite generate Java code dynamically for
> each SQL query? It seems to have performance issues, in my opinion. And
> building the dynamic code object is very hard; the code is unreadable and
> hard to understand, and furthermore we cannot debug it.
>
>
> I wonder why it was designed like that and what the advantage is. We
> searched the internet, but nobody could really explain it (maybe we are not
> good at searching, and our English is poor). So can anybody tell me the
> history of this, or point us to some articles?
>
>
> I apologize for my poor English; I hope you know what I'm talking about.
> Thank you very, very much!





Re: stored procedure

2020-05-10 Thread Albert
Pig Latin is the closest that I am aware of; refer to the Pig adapter in
the project folder.

On Mon, May 11, 2020 at 4:35 AM Lekshmi  wrote:

> Hi,
>   As part of my research project, we would like to use stored
> procedures. Do you have any pointers for the implementation of stored
> procedures or any alternatives in Calcite?
>
> Thanks and Regards
>
> Lekshmi B.G
> Email: lekshmib...@gmail.com
>




Re: [DISCUSS] Remove Kotlin

2019-12-17 Thread Albert
I've used the new version of Calcite with the new version of IntelliJ, and
everything works. I like that.
I can see Vladimir put some effort into this, and I respect that. All effort
put into the codebase should be respected.
From my side, I don't contribute as much now, but occasionally I look
at the new stuff added, so as long as I can REPL the code I am okay with it.
As for Kotlin, as when it was first brought up in the Calcite mail
thread, I am curious about it and would be willing to learn more.



On Wed, Dec 18, 2019 at 7:45 AM Michael Mior  wrote:

> On Tue, Dec 17, 2019 at 15:26, Vladimir Sitnikov wrote:
> >
> > Vladimir>Quidem, CalciteAssert
> > Michael>If you want to propose removing either of these, we could have a
> > Michael>discussion about it, but you're talking about code which is already
> > Michael>heavily used throughout Calcite.
> >
> > The point of "we assume contributors are good at Java, thus we must keep
> > the code to be Java-only" is weak.
> > New contributors will likely see Quidem and CalciteAssert for the first
> > time, and Java knowledge does not help there.
> >
>
> I didn't make that point. Those are your words.
>
> > It does not imply that languages like Quidem and/or CalciteAssert are a
> > bad fit for their job, but it is wrong to judge
> > based solely on "it is not Java".
> >
> > Michael>The consensus from the discussion you started seems to be that
> > Michael>Kotlin should not be added to the tests
> >
> > It is not like that.
>
> I counted at least 5 different contributors stating they did not think
> Kotlin should be introduced into test code. You seemed to be the only
> one in the discussion strongly promoting it. If that's not consensus,
> I must have misinterpreted the discussion.
>
> >
> > Michael>I agree that for these specific tests, readability is improved
> >
> > That is exactly my point. There's an improvement, the downsides are
> small,
> > so I just committed it.
> >
> > Michael>But many tests require more than this
> >
> > That is to be discussed on a test by test basis (or use-case by
> use-case).
> > For instance, strings (especially, multi-line ones) with $ is an issue
> for
> > Kotlin for now.
> >
> > Vladimir
>


-- 
~~~
no mistakes
~~


Re: Re: [ANNOUNCE] Haisheng Yuan joins Calcite PMC

2019-11-11 Thread Albert
congrats, well deserved.



On Mon, Nov 11, 2019 at 2:26 PM Wang Yanlin <1989yanlinw...@163.com> wrote:

> Congrats Haisheng! Well deserved!
>
> Best,
> Wang Yanlin
>
>
>
> At 2019-11-11 14:09:58, "Leonard Xu" wrote:
> >Congratulations! HaiSheng
> >
> >Best,
> >Leonard Xu
> >> On Nov 11, 2019, at 1:22 PM, Danny Chan wrote:
> >>
> >> Congratulations HaiSheng!
> >
>


-- 
~~~
no mistakes
~~


Re: [QUESTION] One query executed on two or more different data storages through Calcite

2019-10-22 Thread Albert
My intuition is to leave the plans alone. When the TableScan is actually
executed, send it to the two targets.
Refer to the CSV example.
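
A minimal sketch of that idea in plain Java, not Calcite's API: the planner sees one table, and the scan fans out to both stores and concatenates the matching rows. `TwoSourceScan`, `scanAll`, and the suppliers standing in for MySQL and Cassandra are hypothetical names for illustration; in Calcite the equivalent logic would presumably live in the table's scan method, as in the CSV example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;
import java.util.function.Supplier;

/** Illustrative only: models one logical table backed by several stores. */
public class TwoSourceScan {
  /**
   * Scans every source, applies the same predicate to each row,
   * and returns the combined result in source order.
   */
  public static <T> List<T> scanAll(
      List<Supplier<List<T>>> sources, Predicate<T> filter) {
    List<T> out = new ArrayList<>();
    for (Supplier<List<T>> source : sources) {
      // Each supplier is a stand-in for a query against one backend.
      for (T row : source.get()) {
        if (filter.test(row)) {
          out.add(row);
        }
      }
    }
    return out;
  }
}
```

The plan stays untouched in this model; only the leaf scan knows there are two backends, which is why no planner rule is involved.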

On Tue, Oct 22, 2019 at 9:26 PM Juan Pan  wrote:

> Hi Danny and Julian
>
>
> Thanks, I did some research after listening to your suggestions. It does
> not seem easy for me, but I will learn more about Calcite and Flink and
> think about your ideas.
>
>
> Best wishes,
> Trista
>
>
>  Juan Pan
>
>
> panj...@apache.org
> Juan Pan(Trista), Apache ShardingSphere
>
>
> On 10/22/2019 13:42,Danny Chan wrote:
> You may need a computation framework like Apache Flink. Use MySQL and
> Cassandra as connector/dataSource and write the results to your sink.
>
> Best,
> Danny Chan
> On Oct 22, 2019 at 10:36 AM +0800, Juan Pan wrote:
> Hi everyone,
>
>
> Thanks for your attention. I could not find a clear answer after reading
> most of the Calcite documentation, so I am sending this email to ask for
> your suggestions.
>
>
> Suppose there are two data stores behind Calcite, e.g. MySQL and Cassandra,
> and the data is split between them. Can I execute a query, e.g.
> `SELECT * FROM tb WHERE id = 1`, on both data stores simultaneously
> through Calcite? In other words, I want to get the final combined result
> from MySQL and Cassandra, each of which stores part of the data in a
> different form.
>
>
> Looking forward to your suggestions and thoughts.
>
>
> Best wishes,
> Trista
>
>
> Juan Pan
>
>
> panj...@apache.org
> Juan Pan(Trista), Apache ShardingSphere
>
>

-- 
~~~
no mistakes
~~


Re: [DISCUSS] Small contributions

2019-09-27 Thread Albert
I really like the compact commit log messages in the Calcite repo.
On the one hand, contributions should be encouraged; on the other hand, I
don't want to see the commit log jammed with 'fix typo' stuff. Let's hope that
won't be the case.

On Sat, Sep 28, 2019 at 4:27 AM Julian Feinauer <
j.feina...@pragmaticminds.de> wrote:

> Yes, I totally agree that's a major change by any means. As Julian pointed
> out above, it's only about non-code changes.
>
> Julian
> 
> From: Andrei Sereda 
> Sent: Friday, September 27, 2019 7:25:56 PM
> To: dev@calcite.apache.org 
> Subject: Re: [DISCUSS] Small contributions
>
> I presume 3rd party library upgrades should go through regular process
> (jira/PR etc.) ?
>
> Dependency upgrade is not considered  "small change" since impact is
> greater than just a "typo fix".
>
>
> On Thu, Sep 26, 2019 at 1:47 PM Julian Hyde  wrote:
>
> > A few points.
> >
> > I don’t like the term “hot fix”. A hot fix has an existing meaning[1] -
> it
> > is a patch you apply to your binaries. Let’s not use that term.
> >
> > Let’s define “small contributions” as contributions that do not modify
> > code and therefore will not break anything, do not need a test or
> > documentation change, and do not need a CI run.
> >
> > I am in favor of accepting small contributions. I wasn’t previously.
> >
> > We can have guidelines about how to label these small contributions (e.g.
> > git labels, certain words in the commit message or PR description). But
> we
> > shouldn’t expect or require contributors to follow those guidelines. By
> > their nature, these contributors have not had time to read all of our
> > policy documents.
> >
> > Reviewers must know what our policy is, and should massage commit
> messages
> > to conform to policy.
> >
> > These kinds of changes are, by definition, very small and simple. A
> > committer can review, approve, fix up, and push to master, and close the
> PR
> > in one go. Five minutes. If the PR requires a back-and-forth then it is
> not
> > a “simple” change.
> >
> > We should not require a JIRA case.
> >
> > We should not apply the usual policy of appending the contributor’s name to the
> > commit message. A typical commit message would be “Fix a comment”.
> >
> > Release manager should remove these kinds of trivial changes from the
> > release notes. They add nothing to the release notes.
> >
> > These kinds of changes do earn “merit” - the basis on which we make
> people
> > committers - but they earn less merit than a bug fix, a new feature, a
> > detailed response to a question on the dev list, or a conference talk. I
> > don’t want people to believe that they can earn committership by fixing
> 100
> > typos.
> >
> > There can be problems if a community over-relies on small PRs. In
> > particular, there is a project in the Incubator that has only one or two
> > regular developers but receives hundreds of contributions a few lines
> long
> > via PRs. The discussion occurs in the PRs, and contributors rarely make
> > more than 1 or 2 contributions. The problem for the project is that there
> > is no emergent “community”. This is a serious problem for that project,
> and
> > obviously we do not have that problem. Still, there is a side effect to
> the
> > back-and-forth discussion to get a change accepted, namely that the
> > individuals get to know each other. We don’t want to lose that.
> >
> >
> > Julian
> >
> > [1] https://en.wikipedia.org/wiki/Hotfix
> >
> >
> >
> >
> > > On Sep 26, 2019, at 5:17 AM, Michael Mior  wrote:
> > >
> > > I thought about a label, but I think it's probably more productive to
> > > just review the change immediately if it really is something trivial.
> > > The problem is that labels can only be applied by committers. That's
> > > why I suggested asking those who submit PRs to include something in
> > > the PR title. If others think a label would help though, I'm not
> > > opposed to it.
> > > --
> > > Michael Mior
> > > mm...@apache.org
> > >
> > > On Thu, Sep 26, 2019 at 07:28, TANG Wen-hui
> > > wrote:
> > >>
> > >> I agree that we should accept these small changes but not create JIRA
> > for them.
> > >> In my opinion, maybe we can label the PR of these small changes.  And
> > process them at regular intervals in case of forgetting.
> > >>
> > >> best,
> > >> --
> > >> wenhui
> > >>
> > >>
> > >>
> > >> winifred.wenhui.t...@gmail.com
> > >>
> > >> From: Haisheng Yuan
> > >> Date: 2019-09-26 10:17
> > >> To: Francis Chuang; dev@calcite.apache.org (dev@calcite.apache.org)
> > >> Subject: Re: Re: [DISCUSS] Small contributions
> > >>> most of the time, the author of the fix would  have moved on and have
> > >> forgotten about it, resulting in the improvement falling through the
> > cracks.
> > >>
> > >> Makes sense. I think our current position is worth reconsidering and I
> > >> agree with Francis.
> > >>
> > >> - Haisheng
> > >>
> > >> 

Re: Calcite swag

2019-08-12 Thread Albert
Nice

On Tuesday, August 13, 2019, Danny Chan wrote:

> Nice t-shirt, I like the black one ~
>
> Best,
> Danny Chan
> On Aug 12, 2019 at 8:49 PM +0800, Michael Mior wrote:
> > For anyone who may be interested, Calcite swag now available at the
> > link below. You'll see a link there to view the logo on other products
> > besides the t-shirt.
> >
> > https://www.redbubble.com/people/comdev/works/40515208-apache-calcite?p=t-shirt
> > --
> > Michael Mior
> > mm...@apache.org
>


-- 
~~~
no mistakes
~~


Re: Filterable table

2019-08-11 Thread Albert
Talking about this class reminds me of my own usage of
ProjectableFilterableTable. It works,
but it feels like a very bad design: what if I want to push down an
aggregation? AggregatableProjectableFilterableTable?

I don't know the intention of the initial design, but this just doesn't
feel right.
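
For context, the scan contract under discussion can be modelled in plain Java. This is a sketch under assumptions, not the real interface: `PushDownScan`, `Eq`, and the row layout are hypothetical, and the real Calcite method works on `RexNode` filters rather than this toy filter class. It shows the two things the table is handed in one call: a mutable filter list, from which the table removes what it handled, and an array of projected column indexes.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Objects;

/** Illustrative stand-in for a scan that accepts filters and projects. */
public class PushDownScan {
  /** A filter of the form "column = value" (hypothetical, for illustration). */
  public static final class Eq {
    final int column;
    final Object value;

    public Eq(int column, Object value) {
      this.column = column;
      this.value = value;
    }
  }

  /**
   * Applies every filter during the scan and removes it from the mutable list,
   * so the caller knows no residual filter is needed, then keeps only the
   * projected columns. A null projects array means "all columns".
   */
  public static List<Object[]> scan(
      List<Object[]> rows, List<Eq> filters, int[] projects) {
    List<Eq> handled = new ArrayList<>(filters);
    filters.clear(); // signal: these filters were consumed by the scan
    List<Object[]> out = new ArrayList<>();
    for (Object[] row : rows) {
      boolean keep = true;
      for (Eq f : handled) {
        if (!Objects.equals(row[f.column], f.value)) {
          keep = false;
          break;
        }
      }
      if (!keep) {
        continue;
      }
      if (projects == null) {
        out.add(row);
      } else {
        Object[] projected = new Object[projects.length];
        for (int i = 0; i < projects.length; i++) {
          projected[i] = row[projects[i]];
        }
        out.add(projected);
      }
    }
    return out;
  }
}
```

Each new kind of push-down adds another parameter to a signature like this one, which is the scaling concern raised above.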

On Mon, Aug 12, 2019 at 9:35 AM Danny Chan  wrote:

> There is a ProjectableFilterableTable[1] that can be used for both filter
> and project push-down; give it a try :)
>
> [1]
> https://github.com/apache/calcite/blob/996cdd0249a4e5086fadc27a15ccc23eb9defd8e/core/src/main/java/org/apache/calcite/schema/ProjectableFilterableTable.java#L36
>
> Best,
> Danny Chan
> On Aug 12, 2019 at 7:55 AM +0800, Lekshmi wrote:
> > Hi Danny Chan,
> > Thank you so much for responding to my email. Can we use a
> > TranslatableTable instead of a FilterableTable? Which is more efficient?
> > Our use case needs both project push-down (pushing the projected fields
> > into the table scan, then filtering) and filter push-down.
> > Or is a FilterableTable enough for project push-down as well?
> > Thanks and Regards
> >
> > Lekshmi B.G
> > Email: lekshmib...@gmail.com
> >
> >
> >
> >
> > On Fri, Aug 9, 2019 at 11:15 AM Danny Chan  wrote:
> >
> > > Hi, Lekshmi ~
> > >
> > >
> > > A FilterableTable can be scanned with a list of filter expressions, which
> > > means the filter conditions above the project + scan can be pushed
> > > directly into the table scan [1][2], which is more efficient. Here is a
> > > CsvFilterableTable example [3].
> > >
> > > [1]
> > > https://github.com/apache/calcite/blob/602b2a10e3f81888c0759ca69f7415497124ef82/core/src/main/java/org/apache/calcite/rel/rules/FilterTableScanRule.java#L130
> > > [2]
> > > https://github.com/apache/calcite/blob/602b2a10e3f81888c0759ca69f7415497124ef82/core/src/main/java/org/apache/calcite/rel/rules/ProjectTableScanRule.java#L128
> > > [3]
> > > https://github.com/apache/calcite/blob/602b2a10e3f81888c0759ca69f7415497124ef82/example/csv/src/main/java/org/apache/calcite/adapter/csv/CsvFilterableTable.java#L65
> > >
> > > Best,
> > > Danny Chan
> > > On Aug 8, 2019 at 10:56 PM +0800, dev@calcite.apache.org wrote:
> > > >
> > > > Filterable
> > >
>


-- 
~~~
no mistakes
~~


Re: Is there a document that can help understand Calcite's code generation?

2019-08-04 Thread Albert
What worked for me was to step through some rather simple SQL queries.
Nothing unusual, though: it is really just a big block of `if else` logic that
concatenates Java code represented as strings, which could probably be better
expressed using FreeMarker.

On Thu, Aug 1, 2019 at 9:00 PM Muhammad Gelbana  wrote:

> Hopefully there is something I can read to help me go through the code
> generation module of Calcite because frankly it's not easy to understand.
> It's an extremely powerful feature and I need to be able to use it and fix
> it if I have to.
>


-- 
~~~
no mistakes
~~


Re: Giving the Calcite logo some love

2019-06-04 Thread Albert
I vote for "logo-alt1-v5.svg"; it looks nice.

On Wed, Jun 5, 2019 at 6:11 AM Stamatis Zampetakis 
wrote:

> I created a branch to gather all alternative logos [1].
>
> Among the two aforementioned proposals, I added a few more variants (check
> logo-alt* under the img directory).
> Have a look and let me know what you think. I'm open to any ideas and
> suggestions.
>
> [1] https://github.com/zabetak/calcite/tree/calcite-logo/site/img
>
> On Tue, Jun 4, 2019 at 7:25 PM Julian Hyde  wrote:
>
> > I prefer both over the current logo. (And I made the current logo.)
> >
> > Let's keep the discussion going, and get to a new logo.
> >
> > On Tue, Jun 4, 2019 at 9:42 AM Ivan Grgurina 
> wrote:
> > >
> > > I prefer the one Daniel sent. It looks cleaner, but maybe the "periodic
> > table" logo can be made better by simplifying it, the shadow on the
> letter
> > C is... unusual.
> > >
> > > 
> > > From: Stamatis Zampetakis 
> > > Sent: Tuesday, June 4, 2019 6:32 PM
> > > To: dev@calcite.apache.org; humbed...@apache.org
> > > Subject: Re: Giving the Calcite logo some love
> > >
> > > Thanks for digging this out Daniel!
> > >
> > > At this point we have two candidates:
> > > http://humbedooh.com/calcite-proposed.svg
> > > https://svgshare.com/s/86r
> > >
> > > Do we like any of above more than our current logo (the way they are or
> > > with slight modifications) ?
> > >
> > >
> > >
> > >
> > > On Mon, Jun 3, 2019 at 3:15 AM Yuzhao Chen 
> wrote:
> > >
> > > > Oh, I see a big hammer, thanks Daniel !
> > > >
> > > > Best,
> > > > Danny Chan
> > > > > On Jun 3, 2019 at 6:21 AM +0800, Daniel Gruno wrote:
> > > > > Found it!
> > > > >
> > > > > http://humbedooh.com/calcite-proposed.svg
> > > > >
> > > > > Thanks, Wayback Machine!
> > > > >
> > > > > On 19/05/2019 10.01, Stamatis Zampetakis wrote:
> > > > > > Hi all,
> > > > > >
> > > > > > We started this discussion about a year ago and many people were
> > > > > > positive with the idea of having a new logo for Calcite.
> > > > > > We had some nice proposals at the time and maybe now somebody
> else
> > has
> > > > > > also new ideas/designs to contribute.
> > > > > >
> > > > > > @Daniel: Is there any chance that you still have the logos you
> > proposed
> > > > > > somewhere available? The old link [1] does not work anymore.
> > > > > >
> > > > > > Best,
> > > > > > Stamatis
> > > > > >
> > > > > > [1]
> > > >
> > http://www.apache.org/logos/comdev-test/res/calcite/calcite-proposed.svg
> > > > > >
> > > > > >
> > > > > > On Thu, Aug 30, 2018 at 12:02 AM Stamatis Zampetakis <
> > > > zabe...@gmail.com
> > > > > > > wrote:
> > > > > >
> > > > > > Vladimir>Could you flip rhombus so it goes right-up?
> > > > > >
> > > > > > https://svgshare.com/s/86r
> > > > > >
> > > > > > I hope this is what you meant.
> > > > > >
> > > > > > Best,
> > > > > > Stamatis
> > > > > >
> > > > > > > On Wed, Aug 29, 2018 at 6:30 PM, Michael Mior
> > > > > > > <mm...@apache.org> wrote:
> > > > > >
> > > > > > Just a note since we're on the topic that whatever logos we come
> > > > > > up with
> > > > > > should be sure to have TM clearly indicated.
> > > > > >
> > > > > > --
> > > > > > Michael Mior
> > > > > > mm...@apache.org 
> > > > > >
> > > > > >
> > > > > >
> > > > > > > On Wed, Aug 29, 2018 at 03:13, Julian Hyde
> > > > > > > <jhyde.apa...@gmail.com> wrote:
> > > > > >
> > > > > > > Yes indeed!
> > > > > > >
> > > > > > > If someone feels inspired to produce a logo, here’s my
> > > > > > suggestion of a
> > > > > > > theme/image: a spider, specifically a Barn Spider (Araneus
> > > > > > Cavaticus)[1].
> > > > > > > It was the origin of the name “avatica”, connects and spins
> > > > > > webs, and the
> > > > > > > eponymous individual in Charlotte’s Web had rather exceptional
> > > > > > > communication skills.
> > > > > > >
> > > > > > > Julian
> > > > > > >
> > > > > > > [1] https://en.m.wikipedia.org/wiki/Barn_spider
> > > > > > >
> > > > > > > > > On Aug 28, 2018, at 9:49 PM, Francis Chuang
> > > > > > > > <francischu...@apache.org> wrote:
> > > > > > > >
> > > > > > > > The designs I have seen so far look really good! Would it
> > > > > > also make
> > > > > > > sense to design a variant for Avatica as well? This is what
> > > > > > the current
> > > > > > > Avatica logo looks like:
> > > > > > https://calcite.apache.org/avatica/img/logo.png
> > > > > > > >
> > > > > > > > Francis
> > > > > > > >
> > > > > > > > > On 29/08/2018 7:08 AM, Vladimir Sitnikov wrote:
> > > > > > > > > Stamatis>How about something like the following:
> > > > > > > > >
> > > > > > > > > There's left-to-right vs right-to-left issue, however I
> > > > > > would claim that
> > > > > > > > > the direction of improvement is right+up.
> > > > > > > > > For instance: BTC price is good when plots go to the right
> > > > > > and go
> 

sql to rule machine learning

2019-06-02 Thread Albert
I found some efforts to bring SQL to the machine learning world:
https://github.com/sql-machine-learning/sqlflow/blob/develop/doc/syntax.md

any comments ?

-- 
~~~
no mistakes
~~


Re: using calcite to read csv files from hdfs.

2019-03-07 Thread Albert
I would say the Calcite CSV example is a good starting point.

On Fri, Mar 8, 2019 at 10:10 AM 勾王敏浩  wrote:

> Hello, everyone.
> This is Wangminhao Gou. I have an urgent need now. I want to read CSV
> files from remote HDFS and parse them using calcite. Now that I can use a
> file stream to get this CSV file, and understand how to use calcite to make
> a relational table of the local csv, what should I do next?
> Looking forward to your reply.
> Best,
> Wangminhao Gou



-- 
~~~
no mistakes
~~


Re: Will the number of traits in the traitSet affect the time of Volcano optimization?

2019-01-17 Thread Albert
I think a trait is really different from your config. I guess `Context`
is what you are looking for; see below.

/**
 * Does nothing.
 *
 * @deprecated Previously, this method installed the cancellation-checking
 * flag for this planner, but is now deprecated. Now, you should add a
 * {@link CancelFlag} to the {@link Context} passed to the constructor.
 *
 * @param cancelFlag flag which the planner should periodically check
 */
@Deprecated // to be removed before 2.0
void setCancelFlag(CancelFlag cancelFlag);
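
A minimal sketch of the pattern that Javadoc points at: settings are attached to a context object handed to the planner and looked up by type later, instead of being encoded as a trait. The code below is a self-contained stand-in, not the Calcite interface itself; `PlannerContextSketch`, `AggregateResources`, and `of` are hypothetical names, and the assumption is only that the real `Context` offers a similar type-based `unwrap` lookup.

```java
/** Illustrative stand-in for passing user settings through a planner context. */
public class PlannerContextSketch {
  /** Hands back an object of the requested type, or null if none is attached. */
  public interface Context {
    <C> C unwrap(Class<C> clazz);
  }

  /** Hypothetical per-aggregate settings that are only read after optimization. */
  public static final class AggregateResources {
    public final int memoryMb;
    public final int cpuCores;

    public AggregateResources(int memoryMb, int cpuCores) {
      this.memoryMb = memoryMb;
      this.cpuCores = cpuCores;
    }
  }

  /** Wraps a single payload object in a context. */
  public static Context of(final Object payload) {
    return new Context() {
      @Override public <C> C unwrap(Class<C> clazz) {
        // Only answer for types the payload actually is.
        return clazz.isInstance(payload) ? clazz.cast(payload) : null;
      }
    };
  }
}
```

Because the settings never enter the trait set, they cannot multiply the number of subsets the planner has to explore.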




On Fri, Jan 18, 2019 at 9:35 AM Hequn Cheng  wrote:

> Hi Stamatis,
>
> Thanks a lot for your reply. Yes, it seems the traits currently in Calcite
> are used by the optimizer. I wonder whether we can extend it for other
> use-cases. For example, I want to provide a way to the users that they can
> set memory or cpu settings for an aggregate node from the user api. These
> settings will only be used after optimization.
>
> I haven't found other ways to achieve this, so maybe using trait is a neat
> way?
>
> Best, Hequn
>
>
> On Fri, Jan 18, 2019 at 12:00 AM Stamatis Zampetakis 
> wrote:
>
> > Hi Hequn,
> >
> > I would describe traits as properties associated with RelNodes that
> provide
> > useful information to the optimizer (rules etc.) in order to generate a
> > plan.
> >
> > If the configuration you are referring to is meant to guide the optimizer
> > in generating a plan then it seems ok to use traits. If not then probably
> > you need something different.
> >
> > Can you elaborate a bit more on your usecase?
> >
> > Best,
> > Stamatis
> >
> >
> > On Tue, Jan 15, 2019, 10:57 AM Hequn Cheng wrote:
> > > Hi Julian,
> > >
> > > Thanks a lot for your reply and the detailed explanation. It solves my
> > > doubts well.
> > > My custom trait only contains one value, so I think that there will not
> > be
> > > a problem.
> > >
> > > May I further the email with another question:
> > > Is it ok or right to use a trait to pass configurations through
> RelNodes?
> > > For example, a configuration set from api for the aggregate and used
> > after
> > > the optimization.
> > > If not, are there any standard ways to achieve this?
> > >
> > > I haven't found any clear definition about trait. Only find comments in
> > > code: *RelTrait represents the manifestation of a relational expression
> > > trait within a trait definition.*
> > >
> > > Thank you!
> > >
> > > On Tue, Jan 15, 2019 at 3:12 AM Julian Hyde  wrote:
> > >
> > > > In most cases increasing the number of traits from one to two will
> > > > increase the planning time by a negligible amount.
> > > >
> > > > > But it can increase the size of the search space. Suppose a particular
> > > > > relational expression has 5 possible sort orders (order by x; order by
> > > > > x, y; order by (); order by z; order by x, z), and initially you have
> > > > > only the collation trait enabled. A particular equivalence set might
> > > > > have 5 subsets, one for each sort order. Now let’s suppose you add the
> > > > > distribution trait to the mix, and there are 3 distributions (partition
> > > > > by (); partition by x; partition by z). Now that equivalence set will
> > > > > have 15 subsets, for the cartesian product of the traits.
> > > >
> > > > A larger search space could increase the planning time (and memory
> > usage)
> > > > significantly.
> > > >
> > > > But if each trait has only one or two values I doubt that there will
> > be a
> > > > problem.
> > > >
> > > > Julian
> > > >
> > > >
> > > > > On Jan 14, 2019, at 1:38 AM, Hequn Cheng 
> > wrote:
> > > > >
> > > > > Hi everyone,
> > > > >
> > > > > I want to pass properties through RelNodes via trait and I wonder
> if
> > > the
> > > > > number of traits in traitSet will affect the time of Volcano
> > > > optimization.
> > > > > For example, increasing one traitDef to two in VolcanoPlanner.
> > > > >
> > > > > I guess the answer is No. Is it correct?
> > > > > As long as the search space is not increased, the Volcano
> > optimization
> > > > time
> > > > > will not increase. And simply increasing the number of traits alone
> > > does
> > > > > not add complexity.
> > > > >
> > > > > Furthermore, besides time, are there any other side effects if I
> > > increase
> > > > > the number of traits?
> > > > >
> > > > > Thank you very much!
> > > > >
> > > > > Best,
> > > > > Hequn
> > > >
> > > >
> > >
> >
>
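
The arithmetic in Julian's reply quoted above (5 sort orders times 3 distributions gives 15 subsets) can be checked with a small plain-Java sketch. `TraitSubsets` and `combinations` are hypothetical names, and the strings only label trait values; nothing here touches the Volcano planner.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative only: enumerates one-value-per-trait combinations. */
public class TraitSubsets {
  /** Returns every combination that picks one value from each trait dimension. */
  public static List<List<String>> combinations(List<List<String>> dimensions) {
    List<List<String>> result = new ArrayList<>();
    result.add(new ArrayList<>()); // start from the single empty combination
    for (List<String> dimension : dimensions) {
      List<List<String>> next = new ArrayList<>();
      for (List<String> prefix : result) {
        for (String value : dimension) {
          List<String> combo = new ArrayList<>(prefix);
          combo.add(value);
          next.add(combo);
        }
      }
      result = next;
    }
    return result;
  }
}
```

A trait with a single value multiplies the count by one, which matches the point that adding such a trait should not enlarge the search space.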


-- 
~~~
no mistakes
~~


Re: [DISCUSS] Where do we draw the line?

2018-10-18 Thread Albert
I haven't contributed in a while.
I mostly agree with Julian's comment. I'm not a native English speaker, and
I think that line of comment could be interpreted as humorous, or at least
may have been intended that way.
If I were contributing that feature, I would care more about whether there
are comments that help push my contribution into the project. So consensus
matters, and things need to get done.

Maybe Zoltan could share how he felt about that review, and Vladimir could
act accordingly.



On Fri, Oct 19, 2018 at 8:23 AM Ashutosh Chauhan 
wrote:

> I have not contributed to Calcite in a while but I keep up with what's going
> on in the project and actively follow the mailing list and JIRAs of interest.
> I concur with Josh that it is public shaming and bullying. This is not
> acceptable. Also, this is not an exception but pattern which tells me that
> it will continue in future too.
> This is not in line with ASF code of conduct and respectful dialog expected
> in community.
>
> Thanks,
> Ashutosh
>
> On Thu, Oct 18, 2018 at 4:24 PM Michael Mior  wrote:
>
> > You can see that I already responded to the comment and I don't really
> have
> > many further thoughts. I do agree though that it's true that this could
> > have been intended humorously and my reaction didn't acknowledge that.
> That
> > said, it's of course worth considering with comments intended to be
> > humorous how they will be perceived.
> >
> > --
> > Michael Mior
> > mm...@apache.org
> >
> >
> > > On Thu, Oct 18, 2018 at 15:37, Julian Hyde wrote:
> >
> > > > I’m not too concerned about the "Do you aim to get an entry in
> > > > accidentallyquadratic?” comment; it could be interpreted humorously,
> > > > if it were not at the end of a long, contentious review thread.
> > >
> > > I am more concerned that it was a long contentious review thread. The
> > > problem is that Vladimir is dogmatic. He makes a point, that point is
> > > acknowledged by the other party, but he absolutely refuses to give
> > ground.
> > > This occurs on the issue of messages for assert statements, and on the
> > > issue of the O(n ^ 2) performance of the algorithm.
> > >
> > > There is no path to consensus, other than yielding to Vladimir.
> > >
> > > I have experienced this behavior also. I had fixed a bug — the
> expression
> > > “TRUE IS FALSE” was being simplified to TRUE — and Vladimir vetoed my
> fix
> > > on the “technical grounds” that I had added tests without sufficient
> > error
> > > messages. The veto left me absolutely furious, and I seriously
> considered
> > > leaving the community. I surmise that other people who are on the
> > receiving
> > > end of his criticism may feel the same way.
> > >
> > > I appreciate Vladimir’s efforts reviewing code, and I appreciate his
> high
> > > standards, but he needs to change his communication style.
> > >
> > > Perhaps it would be useful if we discuss under what circumstances a
> > > committer can veto a change. ASF policy [1] says the following:
> > >
> > > > Votes on code modifications follow a different model. In
> > > > this scenario, a negative vote constitutes a veto, which
> > > > cannot be overridden.
> > >
> > > > If the R-T-C policy is in effect, a positive vote carries the
> > > > very strong implied message, 'I have tested this patch
> > > > myself, and found it good.' Similarly, a negative vote
> > > > usually means that the patch was tested and found to
> > > > be not -good, although the veto (for such it is in this
> > > > case) may be based on other technical grounds.
> > >
> > > I think we need to clarify what “technical grounds" means. Introducing
> a
> > > security hole would certainly qualify. As would introducing a bug in
> > > user-visible functionality (if the same change were not removing a more
> > > serious bug). But in less clear-cut cases, where the purported
> “technical
> > > grounds” are disputed or subjective, I think a consensus of other
> > > committers should override a veto.
> > >
> > > To be clear, the “technical grounds” veto is very important. But if the
> > > threat of it is preventing consensus building, we need to look at it
> > > > carefully. Removing the veto threat forces reviewers to build
> > > > consensus, to persuade rather than cajole; it reduces the power of
> > > > committers over
> > > non-committers, and encourages us to treat each other as equals.
> > >
> > > The commit veto is the “nuclear option” and I, for one, hope that it is
> > > never used again in this project.
> > >
> > > Julian
> > >
> > > > [1] https://www.apache.org/foundation/voting.html
> > >
> > >
> > >
> > > > On Oct 18, 2018, at 11:35 AM, Jesus Camacho Rodriguez <
> > > jcamachorodrig...@hortonworks.com> wrote:
> > > >
> > > > Is it OK for a PMC member of this community to engage with a new
> > > contributor to the project in this way?
> > > >
> > >
> >
> https://github.com/apache/calcite/commit/b470a0cd4572c9f6c4c0e9b51926b97c5af58d3f#r30950660
> > > >
> > > > I wanted to bring everyone´s 

[jira] [Created] (CALCITE-2467) Please update OWASP Dependency Check version to current 3.3.1

2018-08-14 Thread Albert Baker (JIRA)
Albert Baker created CALCITE-2467:
-

 Summary: Please update OWASP Dependency Check version to current 3.3.1
 Key: CALCITE-2467
 URL: https://issues.apache.org/jira/browse/CALCITE-2467
 Project: Calcite
  Issue Type: New Feature
  Components: build
Affects Versions: next, avatica-go-3.1.0, 1.18.0
Reporter: Albert Baker
Assignee: Julian Hyde


Update the pom.xml  from 2.1.1 to 3.3.1



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


Re: [DISCUSS] Committer duties

2018-03-28 Thread Albert
I saw that Flink has a weekly update; this might be a useful way to let
people know more about what's happening.

On Wed, Mar 28, 2018 at 10:10 AM, Michael Mior  wrote:

> I wasn't suggesting that people submitting issues or PRs do the assigning.
> Aside from time, my biggest challenge to contributing more is not knowing
> who to ping since my own expertise is fairly limited. My thinking behind
> owners is just that it would make it clear who has the expertise in certain
> areas. But since the consensus seems to be against it, I'm fine with going
> with any alternative that works.
>
> --
> Michael Mior
> mm...@apache.org
>
> 2018-03-27 21:41 GMT-04:00 Julian Hyde :
>
> > I agree with Jesus. I think it’s more important to deal with the
> > management, and have a person to handle incoming PRs, than to try to
> break
> > the product into well-defined components.
> >
> > "Component owners” are problematic because it’s not always clear
> > (especially to contributors) which parts of the code a given PR touches.
> A
> > vertical feature such as CREATE TYPE (CALCITE-2045) or authorization
> > (CALCITE-2194) will often cut through everything.
> >
> > In contrast, a coordinator would just assign PRs to people with the
> > necessary expertise, not do the reviewing. The load on them would be
> light
> > enough that they might volunteer to do it in some future month, which is
> > good, because we don’t want to burn anyone out.
> >
> > Julian
> >
> >
> > > On Mar 27, 2018, at 4:38 PM, Jesus Camacho Rodriguez <
> > jcama...@apache.org> wrote:
> > >
> > > IMO the most important task is to stay on top of the issues and PRs,
> > > I am not so concerned about the other project management tasks
> > > since it is easier to do them collaboratively.
> > >
> > > I think we do not need owners for components, as it will not help the
> > > project in any way. If an owner does not review some PRs, what are
> > > we going to do? Effectively, we cannot force him/her to do it timely,
> > > and at the same time, we do not want to hold commits to the project
> > > till the owner decides to review the PR.
> > >
> > > Committers are more familiar with each other's work so we (the
> > > committers) could try to proactively monitor the mailing list for
> > > new issues and help contributors by reviewing or pinging the right
> > > reviewer for each of them. Ultimately, it is our responsibility
> > > as a community to commit those patches and keep the project
> > > moving forward. This means that we may need to step up and
> > > review a certain patch as well as we can, even if we are less
> > > familiar with a certain module.
> > >
> > > -Jesús
> > >
> > >
> > >
> > > On 3/27/18, 2:33 PM, "F21"  wrote:
> > >
> > >Hey everyone,
> > >
> > >I am happy to take ownership of the Go avatica client. I am
> currently
> > >quite busy, but I hope to test it against the latest version of
> > avatica
> > >released a couple of weeks ago and see if we can make a release for
> > it.
> > >
> > >Francis
> > >
> > >On 28/03/2018 6:27 AM, Shuyi Chen wrote:
> > >> Hi Julian and Michael,
> > >>
> > >> Thanks a lot for starting the discussion. I think the ownership model
> > is a
> > >> good idea, and has been used by other open source communities, and we
> > can
> > >> further break down core into e.g. sql parser, sql validator,
> relational
> > >> algebra, planner, JSON model, runtime and etc,.  Also, we need to add
> > the
> > >> 'server' module into the JIRA component list for DDLs. And I think
> > adding
> > >> component in the PR title will help owner to filter and identify
> issues
> > >> quickly, also I think we can use a template to enforce a more detail
> PR
> > >> description, so the reviewer can better understand the context and
> > review
> > >> the code.
> > >>
> > >> I have some knowledge of the SQL parser, JSON model, relational algebra
> > >> and planner, and am currently working on the server module to add the
> > >> type/library/function DDLs. I can definitely help with answering
> > >> questions on the mailing list, reviewing code and contributing PRs for
> > >> these components.
> > >> Also, I am definitely interested in learning and helping more on
> > committing
> > >> code and doing releases as well.
> > >>
> > >> Cheers
> > >> Shuyi
> > >>
> > >>
> > >> On Tue, Mar 27, 2018 at 9:51 AM, Michael Mior 
> > wrote:
> > >>
> > > >>> Thanks for starting the discussion Julian. I suggested at some point in the
> > > >>> past that we figure out people who are willing to take ownership over
> > > >>> certain components of Calcite. It seems like this would at least be a start
> > > >>> to staying on top of PRs and issues. However, we would probably have to
> > > >>> segment core practically for this to help.
> > >>>
> > >>> Another thing that comes to mind is staying on top of updates to
> > >>> dependencies. If people are owning certain components, hopefully 

Re: [ANNOUNCE] New committer: Zhen Wang

2018-03-11 Thread Albert
Thanks for the intro and for inviting me, Julian.

I'm really honored to be part of the community. Generally, I'm interested in the
idea of writing higher-level specifications and letting the `machine/core`
handle the performance and other details.
Standard SQL is one such example, and Calcite is one such open source core,
organized elegantly for this purpose.
I'm looking forward to features such as MDX, Streaming SQL... being brought
to the `CORE`,
and I hope to expand my contributions along the way.

Thank you all.

On Fri, Mar 9, 2018 at 12:27 PM, Michael Mior  wrote:

> Congratulations and welcome Zhen!
>
> --
> Michael Mior
> mm...@apache.org
>
> 2018-03-08 19:06 GMT-05:00 Shuyi Chen :
>
>> Congratulations, Zhen!
>>
>> On Thu, Mar 8, 2018 at 3:57 PM, Julian Hyde  wrote:
>>
>> > Apache Calcite's Project Management Committee (PMC) has invited Zhen Wang
>> > to become a committer, and we are pleased to announce that he has accepted.
>> >
>> > Zhen has made a variety of code contributions over the last year or so, in
>> > areas as diverse as code-generation, statistics, and streaming.
>> >
>> > Zhen, welcome, thank you for your contributions, and we look forward to your
>> > further interactions with the community! If you wish, please feel free to
>> > tell us more about yourself and what you are working on.
>> >
>> > Julian Hyde (on behalf of the Apache Calcite PMC)
>> >
>>
>>
>>
>> --
>> "So you have to trust that the dots will somehow connect in your future."
>>
>
>


-- 
~~~
no mistakes
~~


Re: Ask Advices on Calcite Streaming SQL

2016-12-12 Thread Albert
Hi,
The question seems too broad and not well framed, but in general, if you
want to understand more about streaming within `calcite`:

- Firstly:  The document
- Secondly: Julian has also led several discussions on the dev list; you
might be able to find them in the mail archive.
- Last but not least: the code is always there for you to reference. I suggest
you start from the CSV adapter; you might have a clearer view of
how to proceed after you look into that.

Anyway, discussion is always welcome here.



On Mon, Dec 12, 2016 at 5:38 PM, 邓小勇(静行) 
wrote:

> Hi there,
>
> We're working on a streaming engine used internally at
> Alibaba, and would like to support some sort of
> streaming SQL on top of it.
>
> The current plan is to follow Calcite Streaming SQL, though
> we might need to extend the current implementation in
> certain ways (e.g., allowing creating streams connecting
> to external sources/sink, multi-insert, the support for
> multiple statements in a script, different kinds of window functions,
> etc).
>
> Since we'd like to have it ready for production use
> within the next 3-6 months, I'm wondering if you have any
> suggestions or comments on how we should proceed. Thank you very much!
>
> Best,
> Xiaoyong
> Alibaba Cloud






Re: sqlline not available after compile/install/package

2016-05-14 Thread Albert
thanks, this solves my issue.

On Sun, May 15, 2016 at 12:09 AM, Julian Hyde <jh...@apache.org> wrote:

> Try this:
>
> $ rm target/fullclasspath.txt
> $ ./sqlline
>
> Does it work now?
>
> What java version?
>
> Julian
>
> > On May 14, 2016, at 12:58 AM, Albert <zinki...@gmail.com> wrote:
> >
> > calcite/ (master) $ ./sqlline    [15:55:56]
> > Picked up JAVA_TOOL_OPTIONS: -Dfile.encoding=UTF-8
> > Error: Could not find or load main class sqlline.SqlLine
> >
> >
> > It used to be there. Does this mean I have to manage the SqlLine
> > dependency manually now, or did I miss something?
> >
> >
> > thanks.
> >
> >
> > --
> > ~~~
> > no mistakes
> > ~~
>
>




Re: Calcite on StackOverflow

2016-03-31 Thread Albert
Last time I looked, I think there were fewer than 5 questions or so in
total. The trend is changing. ^^

On Thu, Mar 31, 2016 at 5:36 PM, Julian Hyde  wrote:

> Looking for ways to contribute to Calcite? There are a few questions
> tagged “calcite” that do not yet have answers. It would be great if someone
> could answer them.
>
> http://stackoverflow.com/questions/tagged/calcite
>
> Julian
>
>




hashCode functions using multiply method

2016-01-28 Thread Albert
Hi ,
I've noticed that in a lot of places, the Calcite code uses something like
this:

result = 31 * result + (body != null ? body.hashCode() : 0);

Using multiplication in a hash code calculation probably isn't best practice;
something like a shift operator should be more efficient. Since the project
already depends on Guava, why not use its hash code utilities?
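
For illustration, here is a minimal Java sketch of the alternatives
mentioned above. The class `Node` and its fields are hypothetical and not
taken from Calcite. It puts the hand-written multiply form next to an
equivalent shift form and a library helper. The helper shown is
`java.util.Objects.hash` from the JDK, swapped in for Guava so that the
sketch has no dependencies; Guava's
`com.google.common.base.Objects.hashCode(Object...)` is a similar varargs
helper. Note that `31 * x` and `(x << 5) - x` always give the same int, and
modern JVMs can typically do this rewrite themselves, so the gain from a
manual shift may be small.

```java
import java.util.Objects;

/** Hypothetical value class used only for illustration; not part of Calcite. */
final class Node {
  final String name;
  final Object body;

  Node(String name, Object body) {
    this.name = name;
    this.body = body;
  }

  /** Hand-written multiply form, in the style of the line quoted above. */
  int multiplyHash() {
    int result = name != null ? name.hashCode() : 0;
    result = 31 * result + (body != null ? body.hashCode() : 0);
    return result;
  }

  /** Same value using a shift: 31 * x == (x << 5) - x, also under int overflow. */
  int shiftHash() {
    int result = name != null ? name.hashCode() : 0;
    result = (result << 5) - result + (body != null ? body.hashCode() : 0);
    return result;
  }

  /** JDK helper; null-safe and shorter, but it returns a different value. */
  int utilHash() {
    return Objects.hash(name, body);
  }

  public static void main(String[] args) {
    Node n = new Node("scan", "body");
    System.out.println(n.multiplyHash() == n.shiftHash());
    System.out.println(n.utilHash());
  }
}
```

The helper version is less error-prone when fields are added or removed, at
the cost of allocating a varargs array on each call.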


thanks.





physical plan concepts in calcite

2015-12-10 Thread Albert
Hi
I haven't read anything about physical plan concepts in Calcite. I wonder
whether there are any such concepts. I noticed that when Phoenix adopted
Calcite, it wrapped the logical plans; I thought it should only override the
corresponding physical plans, etc.

thanks.
