Hi Julian,

In my opinion, both ways work well. People tend to think differently. Some
prefer symptoms, others - the root cause. I personally prefer the latter
for the following reason. If I face a problem, I first try to debug it on
my own. The result of the analysis is usually some questionable behavior in
a specific part of the code. Once you find the problematic place, you can
run a search in JIRA or Git log (class name, feature name, etc) and check
whether somebody else faced a similar issue. The description "Incorrect
plan ..." is less likely to help me than more concrete "In
SubstitutionVisitor ...". Especially, given that a single root cause may
manifest in several ways. But I would like to stress out - it is a matter
of personal habits and previous experience, not something that I
expect others to follow.

In the past, I worked on the Apache Ignite project. We had a number of
contribution rules, such as "put a comma here", "set the proper component
there", "write the comment in that way", etc. I was the one who actively
enforced this for a may years, because it gave the feeling that everything
is "put in order". Eventually, I came to the conclusion that this does more
harm than good, because I regularly observed confusion and dissatisfaction
of the new contributors (and Apache Ignite community is far less diverse
and active than in Apache Calcite), as they were forced to change their
natural way of thinking or past habits to engage with the community.

Regards,
Vladimir.

ср, 12 янв. 2022 г. в 21:42, Julian Hyde <jhyde.apa...@gmail.com>:

> Hi all,
>
> Can we discuss how we write summaries for Jira cases? In my opinion it’s
> really important, because summaries become commit messages, and commit
> messages become release notes, which is how most people figure out what is
> in Calcite. I spend a lot of my time working with people to write good
> summaries.
>
> I’d like some feedback on whether this approach is useful. And to try to
> teach people how to do it for themselves.
>
> Consider this case https://issues.apache.org/jira/browse/CALCITE-4983 <
> https://issues.apache.org/jira/browse/CALCITE-4983>. (I chose a still
> current case because it doesn’t yet have an ‘answer’.)
>
> The current summary is
>
> >  In SubstitutionVisitor's unifyAggregates, if Aggregate has
> >  grouping sets, we need to handle the condition needs to pull up.
>
> It describes the cause but it doesn’t describe the problem (or the
> symptoms the user sees).
>
> If you take your car into your mechanic the cause is ‘Leaky gasket results
> in oil dripping onto hot manifold’ but the problem is ‘Smoke comes from
> hood when engine gets hot’. Do you agree that the second description is
> much more useful?
>
> In this case, the author came up with an example:
>
> > Here is an example:
> >
> > sql: select empid, deptno from emps group by grouping sets ((empid,
> deptno),(empid))
> > mv: select empid, count(distinct deptno) from emps where empid>100
> >   group by grouping sets ((empid, deptno), (empid))
> >
> > the result plan is:
> >
> >   LogicalCalc(expr#0..2=[{inputs}], deptno=[$t1], EXPR$1=[$t2])
> >     LogicalAggregate(group=[{0, 1}], groups=[[{0, 1}, {0}]],
> EXPR$1=[COUNT(DISTINCT $1)])
> >       EnumerableTableScan(table=[[hr, MV0]])
> >
> > We can see that this plan doesn't handle the condition empid>100
>
> I think it’s a great example. I especially like the last line, where the
> author pointed out what was wrong. I suggest the following summary:
>
> > Incorrect plan for query that has GROUPING SETS and WHERE
>
> Do you think the summary is more useful? Can it be improved?
>
> Julian
>
>
>
>

Reply via email to