[jira] [Created] (CALCITE-5663) [TestKit] RelOptFixture does not enforce the collation

2023-04-19 Thread Marieke Gueye (Jira)
Marieke Gueye created CALCITE-5663:
--

Summary: [TestKit] RelOptFixture does not enforce the collation
Key: CALCITE-5663
URL: https://issues.apache.org/jira/browse/CALCITE-5663
Project: Calcite
Issue Type: Bug
Reporter: Marieke Gueye


In RelOptFixture, we currently change the trait set to enforce
EnumerableConvention.INSTANCE. However, when doing so we forget to carry the
collation over into the new trait set.

```
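if (planner instanceof VolcanoPlanner) {
  // Only the convention is replaced here; the collation from the relRoot
  // is not added to the requested trait set.
  r2 =
      planner.changeTraits(relBefore,
          relBefore.getTraitSet().replace(EnumerableConvention.INSTANCE));
} else {
  r2 = relBefore;
}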
```
The problem goes deeper: as of now, there is no way to access the collation,
because it lives in the relRoot and we currently only access the relNode
through the `relSupplier`.
 
The consequences can be serious, because some rules may be tested
incorrectly.
 
 



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (CALCITE-5662) Fix CAST(BOOLEAN as INTEGER)

2023-04-19 Thread Oliver Lee (Jira)
Oliver Lee created CALCITE-5662:
---

Summary: Fix CAST(BOOLEAN as INTEGER)
Key: CALCITE-5662
URL: https://issues.apache.org/jira/browse/CALCITE-5662
Project: Calcite
Issue Type: Bug
Reporter: Oliver Lee
Assignee: Oliver Lee


Currently, attempting to run {{SELECT CAST(BOOLEAN as INTEGER)}} throws a
{{NumberFormatException}}.

The BigQuery dialect allows casting {{boolean}} to {{int64}} and {{string}},
but not to decimal, bigdecimal, numeric, etc.

Source:
https://cloud.google.com/bigquery/docs/reference/standard-sql/conversion_functions
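
To make the expected behaviour concrete, here is a minimal Java sketch of the BigQuery cast semantics described above (TRUE becomes 1 or "true", FALSE becomes 0 or "false"). The class and method names are hypothetical and are not part of Calcite. The last helper shows one way a NumberFormatException can arise, namely by parsing the boolean's text as a number; whether this is the actual cause in Calcite is an assumption, since the report does not say.

```java
class BooleanCastSketch {
  /** CAST(b AS INT64) with BigQuery semantics: TRUE -> 1, FALSE -> 0. */
  static long castToInt64(boolean b) {
    return b ? 1L : 0L;
  }

  /** CAST(b AS STRING) with BigQuery semantics: "true" or "false". */
  static String castToString(boolean b) {
    return b ? "true" : "false";
  }

  /**
   * Assumed failure mode, for illustration only: parsing the text of a
   * boolean as a number raises NumberFormatException.
   */
  static boolean parsingTextThrows(boolean b) {
    try {
      Long.parseLong(String.valueOf(b));
      return false;
    } catch (NumberFormatException e) {
      return true;
    }
  }

  public static void main(String[] args) {
    System.out.println(castToInt64(true));        // prints 1
    System.out.println(castToInt64(false));       // prints 0
    System.out.println(castToString(true));       // prints true
    System.out.println(parsingTextThrows(true));  // prints true
  }
}
```

A direct mapping such as castToInt64 avoids any string parsing, which is why it cannot fail in this way.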
 







Re: Rewrite rule to convert self-joins into scans

2023-04-19 Thread Julian Hyde
Thank you, Stamatis. This is helpful. I have linked to this email thread in 
CALCITE-5631.

> On Apr 16, 2023, at 2:44 AM, Stamatis Zampetakis wrote:
> 
> Few quick thoughts about this.
> 
> For the problem of query minimization/redundant joins, the simplest
> scenarios that I can think of are the following:
> 
> # Scenario A
> select e1.id from emp e1 inner join emp e2 on e1.name = e2.name;
> 
> If you know the name column is UNIQUE then you can drop the join on e2.
> 
> # Scenario B
> select e.name from emp e inner join dept d on e.dept = d.id;
> 
> If you know that e.dept is a foreign key referencing d.id then
> you can drop the join on dept.
> 
> There are probably other cases to define and handle but we should move
> incrementally.
> 
> As Julian pointed out, the issue logged in CALCITE-5631 could also be
> addressed by employing common table expression related optimizations.
> CTE optimizations and query minimization are both interesting and
> powerful techniques to reduce the cost of the query (whether that is
> speed, money, resources, etc.).
> 
> I would suggest focusing on query minimization first since it is
> pretty well defined and we could come up with solutions much faster
> than for CTEs. CTEs usually involve decisions about whether or not to
> materialize the common expressions, which are closer to lower-level
> ("physical plan") optimizations.
> 
> Most minimization techniques focus on select-project-join (SPJ)
> queries, so I guess we would have to do some preprocessing to bring the
> plan into this format (only Scan, Project, Filter, and Join operators)
> before applying the rule. It would be a separate planning phase
> combining a bunch of existing rules followed by some new ones, which is
> in line with what Julian was saying about bottom-up unification.
> 
> The new rule could be something similar to LoptOptimizeJoinRule that
> operates on a MultiJoin. I haven't checked if the MultiJoin operator
> is sufficient to express an SPJ query but I think the general idea of
> grouping joins together seems to be a promising direction for writing
> new rules.
> 
> Best,
> Stamatis
> 
> On Sun, Apr 16, 2023 at 2:27 AM Julian Hyde wrote:
>> 
>> Ian Bertolacci recently logged
>> https://issues.apache.org/jira/browse/CALCITE-5631, to convert
>> 
>>   select
>>     (select numarrayagg(C5633_203) from T893 where C5633_586 = T895.id),
>>     (select numarrayagg(C5633_170) from T893 where C5633_586 = T895.id)
>>   from T895
>>
>> into
>>
>>   select agg.agg1,
>>     agg.agg2
>>   from T895
>>   left join (
>>     select C5633_586,
>>       numarrayagg(C5633_203) as agg1,
>>       numarrayagg(C5633_170) as agg2
>>     from T893
>>     where C5633_586 is not null
>>     group by C5633_586) as agg
>>   on agg.C5633_586 = T895.id
>> 
>> This seems to me an interesting and important problem. But it's also a
>> hard problem, and it's not clear to me which approach is the best.
>> Does anyone have any ideas for how to approach it?
>> 
>> Also, we could use more example queries that illustrate the general
>> pattern.  (Preferably in terms of simple databases such as EMP and
>> DEPT.)
>> 
>> In Calcite, rewrite rules (RelRule) are usually the preferred approach.
>> But because the scans of the common relational expressions can be an
>> arbitrary distance apart in the RelNode tree, RelRule doesn't seem suitable.
>> 
>> There seem to be some similarities to algorithms to use materialized
>> views, which use bottom-up unification.
>> 
>> Ian's original query actually has correlated scalar sub-queries rather
>> than explicit joins. Would it be better to target common sub-queries
>> rather than joins?
>> 
>> Lastly, there are similarities with the WinMagic algorithm, which
>> converts correlated sub-queries into window aggregates. Is that a
>> useful direction? (My implementation of measures in CALCITE-4496
>> naturally creates correlated scalar sub-queries that can be inlined in
>> the enclosing query if simple, or converted to window aggregates if
>> more complex.)
>> 
>> Julian
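
Scenario A in the quoted message can be checked on a toy in-memory EMP table. The Java sketch below evaluates the self-join naively and compares it with a plain scan. All class, method, and sample names are hypothetical and none of this is Calcite code. It also makes one extra condition explicit that the message leaves implicit: besides being UNIQUE, the name column has to be NOT NULL, because under SQL equality a NULL name never matches itself and the join would drop that row.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class SelfJoinSketch {
  /** Hypothetical EMP row with just the two columns Scenario A uses. */
  static final class Emp {
    final int id;
    final String name;

    Emp(int id, String name) {
      this.id = id;
      this.name = name;
    }
  }

  /** select e1.id from emp e1 inner join emp e2 on e1.name = e2.name */
  static List<Integer> withSelfJoin(List<Emp> emp) {
    List<Integer> ids = new ArrayList<>();
    for (Emp e1 : emp) {
      for (Emp e2 : emp) {
        // SQL equality: a NULL name matches nothing, not even itself.
        if (e1.name != null && e1.name.equals(e2.name)) {
          ids.add(e1.id);
        }
      }
    }
    return ids;
  }

  /** select id from emp */
  static List<Integer> withScanOnly(List<Emp> emp) {
    List<Integer> ids = new ArrayList<>();
    for (Emp e : emp) {
      ids.add(e.id);
    }
    return ids;
  }

  public static void main(String[] args) {
    List<Emp> unique = Arrays.asList(new Emp(1, "Ann"), new Emp(2, "Bob"));
    List<Emp> duplicated = Arrays.asList(new Emp(1, "Ann"), new Emp(2, "Ann"));
    List<Emp> withNull = Arrays.asList(new Emp(1, "Ann"), new Emp(2, null));

    // Unique, non-null names: the join adds nothing. Prints true.
    System.out.println(withSelfJoin(unique).equals(withScanOnly(unique)));
    // Duplicate names: the join multiplies rows. Prints false.
    System.out.println(withSelfJoin(duplicated).equals(withScanOnly(duplicated)));
    // NULL name: the join drops the row. Prints false.
    System.out.println(withSelfJoin(withNull).equals(withScanOnly(withNull)));
  }
}
```

When name is unique but nullable, the self-join is still redundant, but it has to be replaced by a scan with a "name is not null" filter rather than by a bare scan.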



Re: [DISCUSS] Disable JIRA worklog for GitHub PRs

2023-04-19 Thread Dan Zou
+1, it would reduce a lot of noise.

Best,
Dan Zou   








Re: [DISCUSS] Disable JIRA worklog for GitHub PRs

2023-04-19 Thread Benchao Li
+1,

One thing that may affect the current workflow: sometimes I only watch the
Jira issue and rely on it to be notified of PR activity. If we disable the
worklog, I will need to watch the PR too in cases where I am interested in
both the Jira discussion and the PR comments. But that won't be a big
problem for me.



-- 

Best,
Benchao Li


Re: [DISCUSS] Disable JIRA worklog for GitHub PRs

2023-04-19 Thread Michael Mior
+1 from me as well



Re: [DISCUSS] Disable JIRA worklog for GitHub PRs

2023-04-19 Thread Ruben Q L
+1



Re: [DISCUSS] Disable JIRA worklog for GitHub PRs

2023-04-19 Thread Francis Chuang

+1 for this.

Also noticed this and found it to be annoying.






Re: [DISCUSS] Disable JIRA worklog for GitHub PRs

2023-04-19 Thread Alessandro Solimando
Hi Stamatis,
+1000 on this, thanks for starting this discussion!

Best regards,
Alessandro



[DISCUSS] Disable JIRA worklog for GitHub PRs

2023-04-19 Thread Stamatis Zampetakis
Hello,

Everything that happens in a GitHub PR creates a worklog entry under
the respective JIRA ticket.
For every worklog entry we receive a notification from j...@apache.org
when we are watching an issue. The worklog entry and email
notification usually appear messy.

Moreover, if we are watching the GitHub PR we are going to get a
notification from notificati...@github.com which has the same content
as the JIRA worklog entry and is much more readable.

Finally, the PR notification is also going to
comm...@calcite.apache.org so those who are subscribed to that list
will get the same notification three times.

Personally, I never read the JIRA worklog notifications and I much
prefer those from notificati...@github.com.

How do you feel about disabling the worklog entries in JIRA coming
from GitHub PRs?

For archiving purposes, the notifications already go to commits@ so we
don't lose anything by disabling the worklog entries. On the
contrary, I find that this would reduce the noise and redundancy in
our inboxes.

Concretely this is what I have in mind in terms of change:
https://github.com/apache/calcite/pull/3166

Best,
Stamatis