[jira] [Created] (CALCITE-5663) [TestKit] RelOptFixture does not enforce the collation
Marieke Gueye created CALCITE-5663:
--

Summary: [TestKit] RelOptFixture does not enforce the collation
Key: CALCITE-5663
URL: https://issues.apache.org/jira/browse/CALCITE-5663
Project: Calcite
Issue Type: Bug
Reporter: Marieke Gueye

In RelOptFixture, we currently change the trait set to enforce EnumerableConvention.INSTANCE; however, in doing so we forget to carry the collation over into the trait set.

```
if (planner instanceof VolcanoPlanner) {
  // Only the convention is replaced here; the collation is not carried over.
  r2 = planner.changeTraits(relBefore,
      relBefore.getTraitSet().replace(EnumerableConvention.INSTANCE));
} else {
  r2 = relBefore;
}
```

The problem goes deeper: as of now, there is no way to access the collation, because it lives in the RelRoot and we currently only access the RelNode through the `relSupplier`.

The consequences can be serious, since some rules might be tested incorrectly.

--
This message was sent by Atlassian Jira
(v8.20.10#820010)
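To make the issue in CALCITE-5663 concrete, here is a minimal sketch that uses toy stand-ins rather than Calcite's own classes. `Traits`, `Root`, and both `desiredTraits...` methods are hypothetical names invented for illustration; the only thing taken from the ticket is the shape of the problem: the required ordering is stored on the root, so building the desired traits from the node alone silently drops it.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

/** Toy model of the problem; these are NOT Calcite classes. */
final class CollationSketch {

  /** Immutable (convention, collation) pair; the collation is a list of sort-key field indexes. */
  static final class Traits {
    final String convention;
    final List<Integer> collation;

    Traits(String convention, List<Integer> collation) {
      this.convention = convention;
      this.collation = collation;
    }

    Traits replaceConvention(String newConvention) {
      return new Traits(newConvention, collation);
    }

    Traits replaceCollation(List<Integer> newCollation) {
      return new Traits(convention, newCollation);
    }
  }

  /** The ordering the query must produce is kept on the root, next to the node's traits. */
  static final class Root {
    final Traits nodeTraits;
    final List<Integer> requiredCollation;

    Root(Traits nodeTraits, List<Integer> requiredCollation) {
      this.nodeTraits = nodeTraits;
      this.requiredCollation = requiredCollation;
    }
  }

  /** Mirrors the snippet in the ticket: only the convention is replaced. */
  static Traits desiredTraitsIgnoringRoot(Root root) {
    return root.nodeTraits.replaceConvention("ENUMERABLE");
  }

  /** Hypothetical fix: also carry over the collation that the root requires. */
  static Traits desiredTraitsWithCollation(Root root) {
    return root.nodeTraits
        .replaceConvention("ENUMERABLE")
        .replaceCollation(root.requiredCollation);
  }

  public static void main(String[] args) {
    Root root = new Root(
        new Traits("NONE", Collections.<Integer>emptyList()),
        Arrays.asList(0));
    // The ordering requirement is lost when only the node's traits are consulted.
    System.out.println(desiredTraitsIgnoringRoot(root).collation);
    // The ordering requirement survives when the root's collation is carried over.
    System.out.println(desiredTraitsWithCollation(root).collation);
  }
}
```

In this model a planner asked for the first trait set is free to return an unsorted plan, which is the kind of incorrectly tested rule the ticket warns about. Whether the real fix should hand the fixture a RelRoot or pass the collation separately is left open here.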
[jira] [Created] (CALCITE-5662) Fix CAST(BOOLEAN as INTEGER)
Oliver Lee created CALCITE-5662:
---

Summary: Fix CAST(BOOLEAN as INTEGER)
Key: CALCITE-5662
URL: https://issues.apache.org/jira/browse/CALCITE-5662
Project: Calcite
Issue Type: Bug
Reporter: Oliver Lee
Assignee: Oliver Lee

Currently, attempting to run {{SELECT CAST(BOOLEAN as INTEGER)}} throws a {{NumberFormatException}}.

The BigQuery dialect allows casting {{boolean}} to {{int64}} and {{string}}, but not to decimal, bigdecimal, numeric, etc.

Source: https://cloud.google.com/bigquery/docs/reference/standard-sql/conversion_functions
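The behaviour requested in CALCITE-5662 can be sketched in plain Java. This is only an illustration, not Calcite's implementation: the class and method names are hypothetical, and the idea that the exception comes from parsing the boolean's string form is an assumption, since the ticket does not say where the {{NumberFormatException}} is raised. The target values follow the BigQuery conversion rules linked above, where TRUE casts to 1 and FALSE to 0 for int64, and to "true" and "false" for string.

```java
/** Hypothetical helper illustrating BOOLEAN casts; not part of Calcite. */
final class BoolCastSketch {

  /**
   * Shows one plausible (assumed) failure mode: parsing the string form of a
   * boolean as an integer always fails, because "true" and "false" are not digits.
   */
  static boolean parsingStringFormFails(boolean value) {
    try {
      Integer.parseInt(String.valueOf(value));
      return false;
    } catch (NumberFormatException e) {
      return true;
    }
  }

  /** CAST(b AS INT64) in the BigQuery dialect: TRUE becomes 1, FALSE becomes 0. */
  static long castToInt64(boolean value) {
    return value ? 1L : 0L;
  }

  /** CAST(b AS STRING) in the BigQuery dialect: "true" or "false". */
  static String castToString(boolean value) {
    return String.valueOf(value);
  }

  public static void main(String[] args) {
    System.out.println(parsingStringFormFails(true)); // the naive path throws internally
    System.out.println(castToInt64(true));
    System.out.println(castToInt64(false));
    System.out.println(castToString(true));
  }
}
```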
Re: Rewrite rule to convert self-joins into scans
Thank you, Stamatis. This is helpful. I have linked to this email thread in CALCITE-5631.

> On Apr 16, 2023, at 2:44 AM, Stamatis Zampetakis wrote:
>
> Few quick thoughts about this.
>
> For the problem of query minimization/redundant joins the simpler
> scenarios that I can think of are the following:
>
> # Scenario A
> select e1.id from emp e1 inner join emp e2 on e1.name = e2.name;
>
> If you know the name column is UNIQUE then you can drop the join on e2.
>
> # Scenario B
> select e.name from emp e inner join dept d on e.dept = d.id;
>
> If you know that e.dept and d.id form a foreign key relationship then
> you can drop the join on dept.
>
> There are probably other cases to define and handle but we should move
> incrementally.
>
> As Julian pointed out, the issue logged in CALCITE-5631 could also be
> addressed by employing common table expression related optimizations.
> CTE optimizations and query minimization are both interesting and
> powerful techniques to reduce the cost of the query (whether that is
> speed, money, resources, etc).
>
> I would suggest focusing on query minimization first since it is
> pretty well defined and we could come up with solutions much faster
> than CTEs. CTEs usually come with decisions about whether or not to
> materialize the common expressions, which are closer to lower level
> ("physical plan") optimizations.
>
> Most minimization techniques focus on select project join (SPJ)
> queries so I guess we would have to do some preprocessing to bring the
> plan in this format (only Scan, Project, Filter, and Join operators)
> before applying the rule. It would be a separate planning phase
> combining a bunch of existing rules followed by some new ones, which is
> in line with what Julian was saying about bottom-up unification.
>
> The new rule could be something similar to LoptOptimizeJoinRule that
> operates on a MultiJoin. I haven't checked if the MultiJoin operator
> is sufficient to express an SPJ query but I think the general idea of
> grouping joins together seems to be a promising direction for writing
> new rules.
>
> Best,
> Stamatis
>
> On Sun, Apr 16, 2023 at 2:27 AM Julian Hyde wrote:
>>
>> Ian Bertolacci recently logged
>> https://issues.apache.org/jira/browse/CALCITE-5631, to convert
>>
>> select
>>   (select numarrayagg(C5633_203) from T893 where C5633_586 = T895.id),
>>   (select numarrayagg(C5633_170) from T893 where C5633_586 = T895.id)
>> from T895
>>
>> into
>>
>> select agg.agg1,
>>   agg.agg2
>> from T895
>> left join (
>>   select C5633_586,
>>     numarrayagg(C5633_203) as agg1,
>>     numarrayagg(C5633_170) as agg2
>>   from T893
>>   where C5633_586 is not null
>>   group by C5633_586) as agg
>> on agg.C5633_586 = T895.id
>>
>> This seems to me an interesting and important problem. But it's also a
>> hard problem, and it's not clear to me which approach is the best.
>> Does anyone have any ideas for how to approach it?
>>
>> Also, we could use more example queries that illustrate the general
>> pattern. (Preferably in terms of simple databases such as EMP and
>> DEPT.)
>>
>> In Calcite rewrite rules (RelRule) are usually the preferred approach.
>> Because the common relational expressions scans can be an arbitrary
>> distance apart in the RelNode tree, RelRule doesn't seem suitable.
>>
>> There seem to be some similarities to algorithms to use materialized
>> views, which use bottom-up unification.
>>
>> Ian's original query actually has correlated scalar sub-queries rather
>> than explicit joins. Would it be better to target common sub-queries
>> rather than joins?
>>
>> Lastly, there are similarities with the WinMagic algorithm, which
>> converts correlated sub-queries into window aggregates. Is that a
>> useful direction? (My implementation of measures in CALCITE-4496
>> naturally creates correlated scalar sub-queries that can be inlined in
>> the enclosing query if simple, or converted to window aggregates if
>> more complex.)
>>
>> Julian
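Scenario A from the quoted message can be checked on in-memory data. The sketch below is a toy evaluation in plain Java, not a Calcite rule; `Emp`, `selfJoinIds`, `scanIds`, and `nameIsUniqueAndNotNull` are hypothetical names. It evaluates the self-join literally and compares it with a plain scan. One caveat is made explicit here as an assumption about how the rewrite would have to be guarded: because `e1.name = e2.name` is never true for a NULL name, the join drops such rows while a scan keeps them, so the column needs to be NOT NULL as well as UNIQUE (or the rewritten scan needs an `IS NOT NULL` filter).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Toy check of "self-join on a unique column equals a scan"; not a Calcite rule. */
final class SelfJoinSketch {

  static final class Emp {
    final int id;
    final String name;

    Emp(int id, String name) {
      this.id = id;
      this.name = name;
    }
  }

  /** select e1.id from emp e1 inner join emp e2 on e1.name = e2.name */
  static List<Integer> selfJoinIds(List<Emp> emp) {
    List<Integer> ids = new ArrayList<>();
    for (Emp e1 : emp) {
      for (Emp e2 : emp) {
        // SQL equality is never true when either side is NULL.
        if (e1.name != null && e1.name.equals(e2.name)) {
          ids.add(e1.id);
        }
      }
    }
    return ids;
  }

  /** select id from emp */
  static List<Integer> scanIds(List<Emp> emp) {
    List<Integer> ids = new ArrayList<>();
    for (Emp e : emp) {
      ids.add(e.id);
    }
    return ids;
  }

  /** Precondition assumed for the rewrite: no NULL names and no repeated names. */
  static boolean nameIsUniqueAndNotNull(List<Emp> emp) {
    Set<String> seen = new HashSet<>();
    for (Emp e : emp) {
      if (e.name == null || !seen.add(e.name)) {
        return false;
      }
    }
    return true;
  }

  public static void main(String[] args) {
    List<Emp> unique = Arrays.asList(new Emp(1, "ann"), new Emp(2, "bob"), new Emp(3, "cy"));
    List<Emp> dup = Arrays.asList(new Emp(1, "ann"), new Emp(2, "ann"), new Emp(3, "bob"));
    System.out.println(selfJoinIds(unique).equals(scanIds(unique))); // true
    System.out.println(selfJoinIds(dup)); // [1, 1, 2, 2, 3]
  }
}
```

With duplicate names the join multiplies rows, so the rewrite is only valid when the uniqueness precondition holds. Scenario B follows the same pattern, with the foreign key guaranteeing exactly one matching `dept` row per `emp` row whose `dept` is not NULL.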
Re: [DISCUSS] Disable JIRA worklog for GitHub PRs
+1, it would reduce a lot of noise.

Best,
Dan Zou

> On Apr 19, 2023, at 19:37, Benchao Li wrote:
>
> +1,
>
> One thing that may affect the current workflow is that sometimes I only
> watch the Jira issue and get notified of PR activity through it. If we
> disable this, I will need to watch the PR too in cases where I'm
> interested in both the Jira discussion and the PR comments. But that
> won't be a big problem for me.
>
> On Wed, Apr 19, 2023 at 18:43, Michael Mior wrote:
>
>> +1 from me as well
>>
>> On Wed, Apr 19, 2023, 04:19 Stamatis Zampetakis wrote:
>>
>>> Hello,
>>>
>>> Everything that happens in a GitHub PR creates a worklog entry under
>>> the respective JIRA ticket.
>>> For every worklog entry we receive a notification from j...@apache.org
>>> when we are watching an issue. The worklog entry and email
>>> notification usually appear messy.
>>>
>>> Moreover, if we are watching the GitHub PR we are going to get a
>>> notification from notificati...@github.com which has the same content
>>> as the JIRA worklog entry and is much more readable.
>>>
>>> Finally, the PR notification is also going to
>>> comm...@calcite.apache.org so those who are subscribed to that list
>>> will get the same notification three times.
>>>
>>> Personally, I never read the JIRA worklog notifications and I largely
>>> prefer those from notificati...@github.com.
>>>
>>> How do you feel about disabling the worklog entries in JIRA coming
>>> from GitHub PRs?
>>>
>>> For archiving purposes, the notifications already go to commits@ so we
>>> don't lose anything from disabling the worklog entries. On the
>>> contrary, I find that this would reduce the noise and redundancy in
>>> our inboxes.
>>>
>>> Concretely this is what I have in mind in terms of change:
>>> https://github.com/apache/calcite/pull/3166
>>>
>>> Best,
>>> Stamatis
>>>
>>
>
> --
>
> Best,
> Benchao Li
Re: [DISCUSS] Disable JIRA worklog for GitHub PRs
+1,

One thing that may affect the current workflow is that sometimes I only watch the Jira issue and get notified of PR activity through it. If we disable this, I will need to watch the PR too in cases where I'm interested in both the Jira discussion and the PR comments. But that won't be a big problem for me.

On Wed, Apr 19, 2023 at 18:43, Michael Mior wrote:

> +1 from me as well
>
> On Wed, Apr 19, 2023, 04:19 Stamatis Zampetakis wrote:
>
>> Hello,
>>
>> Everything that happens in a GitHub PR creates a worklog entry under
>> the respective JIRA ticket.
>> For every worklog entry we receive a notification from j...@apache.org
>> when we are watching an issue. The worklog entry and email
>> notification usually appear messy.
>>
>> Moreover, if we are watching the GitHub PR we are going to get a
>> notification from notificati...@github.com which has the same content
>> as the JIRA worklog entry and is much more readable.
>>
>> Finally, the PR notification is also going to
>> comm...@calcite.apache.org so those who are subscribed to that list
>> will get the same notification three times.
>>
>> Personally, I never read the JIRA worklog notifications and I largely
>> prefer those from notificati...@github.com.
>>
>> How do you feel about disabling the worklog entries in JIRA coming
>> from GitHub PRs?
>>
>> For archiving purposes, the notifications already go to commits@ so we
>> don't lose anything from disabling the worklog entries. On the
>> contrary, I find that this would reduce the noise and redundancy in
>> our inboxes.
>>
>> Concretely this is what I have in mind in terms of change:
>> https://github.com/apache/calcite/pull/3166
>>
>> Best,
>> Stamatis
>>
>

--

Best,
Benchao Li
Re: [DISCUSS] Disable JIRA worklog for GitHub PRs
+1 from me as well

On Wed, Apr 19, 2023, 04:19 Stamatis Zampetakis wrote:

> Hello,
>
> Everything that happens in a GitHub PR creates a worklog entry under
> the respective JIRA ticket.
> For every worklog entry we receive a notification from j...@apache.org
> when we are watching an issue. The worklog entry and email
> notification usually appear messy.
>
> Moreover, if we are watching the GitHub PR we are going to get a
> notification from notificati...@github.com which has the same content
> as the JIRA worklog entry and is much more readable.
>
> Finally, the PR notification is also going to
> comm...@calcite.apache.org so those who are subscribed to that list
> will get the same notification three times.
>
> Personally, I never read the JIRA worklog notifications and I largely
> prefer those from notificati...@github.com.
>
> How do you feel about disabling the worklog entries in JIRA coming
> from GitHub PRs?
>
> For archiving purposes, the notifications already go to commits@ so we
> don't lose anything from disabling the worklog entries. On the
> contrary, I find that this would reduce the noise and redundancy in
> our inboxes.
>
> Concretely this is what I have in mind in terms of change:
> https://github.com/apache/calcite/pull/3166
>
> Best,
> Stamatis
>
Re: [DISCUSS] Disable JIRA worklog for GitHub PRs
+1

On Wed, Apr 19, 2023 at 9:33 AM Francis Chuang wrote:

> +1 for this.
>
> Also noticed this and found it to be annoying.
>
> On 19/04/2023 6:22 pm, Alessandro Solimando wrote:
>> Hi Stamatis,
>> +1000 on this, thanks for starting this discussion!
>>
>> Best regards,
>> Alessandro
>>
>> On Wed 19 Apr 2023, 10:19 Stamatis Zampetakis, wrote:
>>
>>> Hello,
>>>
>>> Everything that happens in a GitHub PR creates a worklog entry under
>>> the respective JIRA ticket.
>>> For every worklog entry we receive a notification from j...@apache.org
>>> when we are watching an issue. The worklog entry and email
>>> notification usually appear messy.
>>>
>>> Moreover, if we are watching the GitHub PR we are going to get a
>>> notification from notificati...@github.com which has the same content
>>> as the JIRA worklog entry and is much more readable.
>>>
>>> Finally, the PR notification is also going to
>>> comm...@calcite.apache.org so those who are subscribed to that list
>>> will get the same notification three times.
>>>
>>> Personally, I never read the JIRA worklog notifications and I largely
>>> prefer those from notificati...@github.com.
>>>
>>> How do you feel about disabling the worklog entries in JIRA coming
>>> from GitHub PRs?
>>>
>>> For archiving purposes, the notifications already go to commits@ so we
>>> don't lose anything from disabling the worklog entries. On the
>>> contrary, I find that this would reduce the noise and redundancy in
>>> our inboxes.
>>>
>>> Concretely this is what I have in mind in terms of change:
>>> https://github.com/apache/calcite/pull/3166
>>>
>>> Best,
>>> Stamatis
>>>
>>
>
Re: [DISCUSS] Disable JIRA worklog for GitHub PRs
+1 for this.

Also noticed this and found it to be annoying.

On 19/04/2023 6:22 pm, Alessandro Solimando wrote:
> Hi Stamatis,
> +1000 on this, thanks for starting this discussion!
>
> Best regards,
> Alessandro
>
> On Wed 19 Apr 2023, 10:19 Stamatis Zampetakis, wrote:
>
>> Hello,
>>
>> Everything that happens in a GitHub PR creates a worklog entry under
>> the respective JIRA ticket.
>> For every worklog entry we receive a notification from j...@apache.org
>> when we are watching an issue. The worklog entry and email
>> notification usually appear messy.
>>
>> Moreover, if we are watching the GitHub PR we are going to get a
>> notification from notificati...@github.com which has the same content
>> as the JIRA worklog entry and is much more readable.
>>
>> Finally, the PR notification is also going to
>> comm...@calcite.apache.org so those who are subscribed to that list
>> will get the same notification three times.
>>
>> Personally, I never read the JIRA worklog notifications and I largely
>> prefer those from notificati...@github.com.
>>
>> How do you feel about disabling the worklog entries in JIRA coming
>> from GitHub PRs?
>>
>> For archiving purposes, the notifications already go to commits@ so we
>> don't lose anything from disabling the worklog entries. On the
>> contrary, I find that this would reduce the noise and redundancy in
>> our inboxes.
>>
>> Concretely this is what I have in mind in terms of change:
>> https://github.com/apache/calcite/pull/3166
>>
>> Best,
>> Stamatis
>>
>
Re: [DISCUSS] Disable JIRA worklog for GitHub PRs
Hi Stamatis,
+1000 on this, thanks for starting this discussion!

Best regards,
Alessandro

On Wed 19 Apr 2023, 10:19 Stamatis Zampetakis, wrote:

> Hello,
>
> Everything that happens in a GitHub PR creates a worklog entry under
> the respective JIRA ticket.
> For every worklog entry we receive a notification from j...@apache.org
> when we are watching an issue. The worklog entry and email
> notification usually appear messy.
>
> Moreover, if we are watching the GitHub PR we are going to get a
> notification from notificati...@github.com which has the same content
> as the JIRA worklog entry and is much more readable.
>
> Finally, the PR notification is also going to
> comm...@calcite.apache.org so those who are subscribed to that list
> will get the same notification three times.
>
> Personally, I never read the JIRA worklog notifications and I largely
> prefer those from notificati...@github.com.
>
> How do you feel about disabling the worklog entries in JIRA coming
> from GitHub PRs?
>
> For archiving purposes, the notifications already go to commits@ so we
> don't lose anything from disabling the worklog entries. On the
> contrary, I find that this would reduce the noise and redundancy in
> our inboxes.
>
> Concretely this is what I have in mind in terms of change:
> https://github.com/apache/calcite/pull/3166
>
> Best,
> Stamatis
>
[DISCUSS] Disable JIRA worklog for GitHub PRs
Hello,

Everything that happens in a GitHub PR creates a worklog entry under
the respective JIRA ticket.
For every worklog entry we receive a notification from j...@apache.org
when we are watching an issue. The worklog entry and email
notification usually appear messy.

Moreover, if we are watching the GitHub PR we are going to get a
notification from notificati...@github.com which has the same content
as the JIRA worklog entry and is much more readable.

Finally, the PR notification is also going to
comm...@calcite.apache.org so those who are subscribed to that list
will get the same notification three times.

Personally, I never read the JIRA worklog notifications and I largely
prefer those from notificati...@github.com.

How do you feel about disabling the worklog entries in JIRA coming
from GitHub PRs?

For archiving purposes, the notifications already go to commits@ so we
don't lose anything from disabling the worklog entries. On the
contrary, I find that this would reduce the noise and redundancy in
our inboxes.

Concretely this is what I have in mind in terms of change:
https://github.com/apache/calcite/pull/3166

Best,
Stamatis