Re: Is there a mechanism for constant folding in Calcite?

2021-10-27 Thread Stamatis Zampetakis
Hi Ian,

As Alessandro correctly said, the engine relies on scalar subqueries being
unnested. Rules are not applied to RexSubQuery expressions.
In a relatively recent discussion [1], somebody else also expressed the need
to retain nesting in scalar subqueries.
I think that if somebody pushes this forward, it could be incorporated into
the core.

Best,
Stamatis

[1]
https://lists.apache.org/thread.html/r17369e6319e2d810fdeb99bdc6fd90cf5dfd5648fb2f85589b4cc12d%40%3Cdev.calcite.apache.org%3E

On Tue, Oct 26, 2021 at 8:57 PM Alessandro Solimando <
alessandro.solima...@gmail.com> wrote:

> Hi Ian,
> regarding commutativity/associativity I think this ML discussion
> <https://lists.apache.org/thread.html/r68b60538222d01a3cf065e44f8dbad7c23d042350d9cd6b7b52ee811%40%3Cdev.calcite.apache.org%3E>
> could be relevant and it has some pointers.
>
> As for decorrelation, I think that most of Calcite's code relies
> on subqueries being decorrelated, as in the example you cited.
> I recall some ML discussions where this topic was elaborated on, but I
> can't find any right now.
>
> Best regards,
> Alessandro
>


Re: Review request: CALCITE-4818 (AggregateExpandDistinctAggregatesRule must infer correct data type for top aggregate calls)

2021-10-27 Thread Taras Ledkov

Hi Calcite team,

A gentle reminder to review / merge the patch for CALCITE-4818 [1].

[1]. https://issues.apache.org/jira/browse/CALCITE-4818

On 15.10.2021 15:16, Taras Ledkov wrote:

Hi,

A gentle reminder about the review.

On 04.10.2021 20:30, Taras Ledkov wrote:

Hi,

Please review the patch for the issue CALCITE-4818 [1], see PR#2560 [2].

It looks like the rule 'AggregateExpandDistinctAggregatesRule' contains
another bug in inferring the result type of the top aggregate calls.


e.g.

If the type system widens the SUM type the way PostgreSQL does:
SUM(TINYINT | SMALLINT | INTEGER) -> BIGINT
SUM(BIGINT) -> DECIMAL

The query:
SELECT SUM(i), SUM(DISTINCT i) FROM TBL;
where i is an INTEGER field.

produces the plan:
LogicalProject(EXPR$0=[CAST($0):BIGINT], EXPR$1=[$1])
  LogicalAggregate(group=[{}], EXPR$0=[SUM($1)], EXPR$1=[SUM($0)])  --> RowType[DECIMAL, BIGINT]
    LogicalAggregate(group=[{0}], EXPR$0=[SUM($0)])  --> RowType[INTEGER, BIGINT]
      LogicalProject(COMM=[$6])
        LogicalTableScan(table=[[CATALOG, SALES, EMP]])

But the rule ignores the changed row type of the bottom aggregate.

I propose a simple fix: pass a 'null' type to the
'AggregateCall.create' call so that the aggregate type is inferred from
the input.
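
To make the mismatch concrete, here is a minimal, self-contained sketch. It
is not Calcite code: SimpleType and deriveSumType are hypothetical stand-ins
for the type system's SUM inference, and the SUM(DECIMAL) case is an
assumption, since only the integer cases are listed above. It only shows why
a type kept from the original SUM(i) call can disagree with the type
inferred from the bottom aggregate's output.

```java
/** Hypothetical stand-in for SQL type names; not Calcite's SqlTypeName. */
enum SimpleType { TINYINT, SMALLINT, INTEGER, BIGINT, DECIMAL }

final class SumTypeSketch {
  /** PostgreSQL-like SUM widening, following the two rules listed above. */
  static SimpleType deriveSumType(SimpleType arg) {
    switch (arg) {
      case TINYINT:
      case SMALLINT:
      case INTEGER:
        return SimpleType.BIGINT;
      case BIGINT:
        return SimpleType.DECIMAL;
      default:
        // Assumption: SUM(DECIMAL) stays DECIMAL.
        return SimpleType.DECIMAL;
    }
  }

  public static void main(String[] args) {
    SimpleType column = SimpleType.INTEGER;

    // Type of SUM(i) in the original query: BIGINT.
    SimpleType typeOfOriginalCall = deriveSumType(column);

    // The bottom aggregate already produces SUM(i) per group: BIGINT.
    SimpleType bottomSumType = deriveSumType(column);

    // The top aggregate sums those partial sums, so its input is BIGINT
    // and the inferred result is DECIMAL, not the original BIGINT.
    SimpleType inferredTopType = deriveSumType(bottomSumType);

    System.out.println("kept from original call: " + typeOfOriginalCall);
    System.out.println("inferred from input:     " + inferredTopType);
  }
}
```

Inferring the type from the input, as proposed, corresponds to using
inferredTopType for the top call instead of typeOfOriginalCall.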


[1]. https://issues.apache.org/jira/browse/CALCITE-4818
[2]. https://github.com/apache/calcite/pull/2560


--
Taras Ledkov
Mail-To: tled...@gridgain.com



Proposal: Better abstraction/encapsulation for RelMetadataQuery

2021-10-27 Thread Jacques Nadeau
Hey all,

I've been working on AOT compilation with Graal (Janino not usable) and I'm
struggling with the hierarchy of classes related to RelMetadataQuery. Right
now it feels like there is tight coupling between how the
JaninoRelMetadataProvider works and RelMetadataQuery. In a perfect world, it
seems like the current implementation should be separated into
JaninoRelMetadataQuery and RelMetadataQuery.
- JaninoRelMetadataQuery: The current version of RelMetadataQuery (roughly)
- RelMetadataQuery: An abstract class with minimal implementation that
doesn't extend RelMetadataQueryBase (which is fully coupled
to JaninoRelMetadataProvider)

Existing users who extend RelMetadataQuery today would move to extending
JaninoRelMetadataQuery. Some examples of the current problematic
encapsulation:
- Weird behavior in RelOptCluster: [1]
- Several internal concerns leaked via public fields in
RelMetadataQueryBase [2]
- No way to override "initialHandler" in RelMetadataQueryBase because it is
declared as a static method [3]
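
As a rough illustration of the split proposed above, a minimal sketch
follows. All class and method names here are hypothetical (the "Sketch"
suffix marks them as such), a String id stands in for a RelNode, and a
single getRowCount call stands in for the full set of metadata methods. It
is not how Calcite is implemented today.

```java
import java.util.Map;
import java.util.function.Function;

/** Hypothetical minimal base class: no handler generation, no public state. */
abstract class RelMetadataQuerySketch {
  /** One representative metadata call; relId stands in for a RelNode. */
  abstract double getRowCount(String relId);
}

/** Would hold roughly what the current class does, backed by generated handlers. */
class JaninoRelMetadataQuerySketch extends RelMetadataQuerySketch {
  // Stand-in for a handler that would be compiled at runtime.
  private final Function<String, Double> generatedHandler;

  JaninoRelMetadataQuerySketch(Function<String, Double> generatedHandler) {
    this.generatedHandler = generatedHandler;
  }

  @Override double getRowCount(String relId) {
    return generatedHandler.apply(relId);
  }
}

/** An alternative with no runtime code generation, e.g. for AOT builds. */
class FixedRelMetadataQuerySketch extends RelMetadataQuerySketch {
  private final Map<String, Double> rowCounts;

  FixedRelMetadataQuerySketch(Map<String, Double> rowCounts) {
    this.rowCounts = rowCounts;
  }

  @Override double getRowCount(String relId) {
    // Fall back to 1 row when nothing is known about the node.
    return rowCounts.getOrDefault(relId, 1.0);
  }
}
```

With this shape, code that only needs metadata would depend on the abstract
base, and existing subclasses would extend the Janino-backed class.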

Does anyone have strong opinions about this type of change?

FYI, I know performance was one of the key points of focus for this code, but
unfortunately I haven't noticed any benchmarks in Calcite focused on it
(did I miss some?).

[1]
https://github.com/apache/calcite/blob/master/core/src/main/java/org/apache/calcite/plan/RelOptCluster.java#L156
[2]
https://github.com/apache/calcite/blob/master/core/src/main/java/org/apache/calcite/rel/metadata/RelMetadataQueryBase.java#L69
[3]
https://github.com/apache/calcite/blob/master/core/src/main/java/org/apache/calcite/rel/metadata/RelMetadataQueryBase.java#L84


[jira] [Created] (CALCITE-4868) ES adapter doesn't support sort aggregation result

2021-10-27 Thread ZheHu (Jira)
ZheHu created CALCITE-4868:
--

 Summary: ES adapter doesn't support sort aggregation result
 Key: CALCITE-4868
 URL: https://issues.apache.org/jira/browse/CALCITE-4868
 Project: Calcite
  Issue Type: Bug
  Components: elasticsearch-adapter
Affects Versions: 1.28.0
Reporter: ZheHu


When I run the following SQL in the ES adapter module (AggregationTest.java):
{code:java}
@Test void testSortAggregation() {
  CalciteAssert.that()
      .with(newConnectionFactory())
      .query("select cat5, max(val1) as MAX_VAL1 from view group by cat5 "
          + "order by MAX_VAL1 desc, cat5 desc")
      .returns("cat5=2; val1=7\n");
}
{code}

I get the following exception:
{code:java}
java.lang.IllegalArgumentException: Field MAX_VAL1 not defined for aggs
    at org.apache.calcite.adapter.elasticsearch.ElasticsearchMapping.missingValueFor(ElasticsearchMapping.java:85)
    at org.apache.calcite.adapter.elasticsearch.ElasticsearchTable.aggregate(ElasticsearchTable.java:226)
    at org.apache.calcite.adapter.elasticsearch.ElasticsearchTable.find(ElasticsearchTable.java:120)
    at org.apache.calcite.adapter.elasticsearch.ElasticsearchTable.access$000(ElasticsearchTable.java:61)
    at org.apache.calcite.adapter.elasticsearch.ElasticsearchTable$ElasticsearchQueryable.find(ElasticsearchTable.java:376)
    at org.apache.calcite.jdbc.CalcitePrepare$CalciteSignature.enumerable(CalcitePrepare.java:363)
    at org.apache.calcite.jdbc.CalciteConnectionImpl.enumerable(CalciteConnectionImpl.java:338)
    at org.apache.calcite.jdbc.CalciteMetaImpl._createIterable(CalciteMetaImpl.java:578)
    at org.apache.calcite.jdbc.CalciteMetaImpl.createIterable(CalciteMetaImpl.java:569)
    at org.apache.calcite.avatica.AvaticaResultSet.execute(AvaticaResultSet.java:184)
    at org.apache.calcite.jdbc.CalciteResultSet.execute(CalciteResultSet.java:64)
    at org.apache.calcite.jdbc.CalciteResultSet.execute(CalciteResultSet.java:43)
    at org.apache.calcite.avatica.AvaticaConnection$1.execute(AvaticaConnection.java:667)
    at org.apache.calcite.jdbc.CalciteMetaImpl.prepareAndExecute(CalciteMetaImpl.java:638)
    at org.apache.calcite.avatica.AvaticaConnection.prepareAndExecuteInternal(AvaticaConnection.java:675)
    at org.apache.calcite.avatica.AvaticaStatement.executeInternal(AvaticaStatement.java:156)
{code}

The exception occurs because the ES mapping doesn't contain the field
"MAX_VAL1".

According to the SQL, the generated ES query should be (I have run it in ES 7.3.2):
{code:java}
{
  "aggs": {
    "cat5_val1": {
      "terms": {
        "field": "cat5",
        "order": [
          {
            "_key": "desc"
          },
          {
            "MAX_VAL1": "desc"
          }
        ]
      },
      "aggs": {
        "MAX_VAL1": {
          "max": {
            "field": "val1"
          }
        }
      }
    }
  }
}
{code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


Re: Proposal: Better abstraction/encapsulation for RelMetadataQuery

2021-10-27 Thread James Starr
Hi Jacques,

I have a PR[1] for decoupling the JaninoRelMetadataProvider from the
RelMetadataQuery.  Instead of using inheritance and requiring downstream
projects to update their code, composition was used for the creation of the
initial handler stub, handler creation, and creation of a cache.  This does
not require downstream projects to extend a class that is coupled with how
the handlers are generated and maintains all of the existing API while
decoupling the handler creation from the RelMetadataQuery.  The weird
behavior in RelOptCluster[2] is encapsulated into a single class and easily
disregarded by using a different implementation.  My implementation still
has a public cache object in RelMetadataQueryBase, which is how the
VolcanoPlanner currently invalidates the cache.
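
To illustrate the composition idea in general terms, here is a minimal
sketch. It is not the code in the PR: HandlerProviderSketch,
MetadataQuerySketch and their methods are hypothetical names, a String id
stands in for a RelNode, and a single row-count handler stands in for the
full handler set. The point is only that the query class receives its
handler source through the constructor instead of inheriting it.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

/** Hypothetical: supplies handlers; generated or hand-written ones can be plugged in. */
interface HandlerProviderSketch {
  /** Returns a handler that computes a row count for a node id. */
  Function<String, Double> rowCountHandler();
}

/** Hypothetical query class that is composed with a provider rather than extending one. */
final class MetadataQuerySketch {
  private final HandlerProviderSketch provider;
  private final Map<String, Double> cache = new HashMap<>();

  MetadataQuerySketch(HandlerProviderSketch provider) {
    this.provider = provider;
  }

  /** Computes the value once per node id and then serves it from the cache. */
  double getRowCount(String relId) {
    return cache.computeIfAbsent(relId, provider.rowCountHandler());
  }

  /** Lets a planner drop cached values after the plan changes. */
  void clearCache() {
    cache.clear();
  }
}
```

A downstream project would keep calling the same query methods and only
choose a different provider where the default one is not usable.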

Previously, JdbcTest.testJoinFiveWay was used to benchmark metadata.  I ran
some benchmarks a while back[3] and got numbers very similar to what Julian
had reported in an email thread when the initial move to Janino was
done.  JdbcTest.testJoinManyWay is another good candidate for generating a
large number of metadata calls.

Overall, I would prefer a change that does not require downstream projects
to extend classes that expose internal implementation details or to
change their code.

James

[1]
https://github.com/apache/calcite/pull/2378
[2]
core/src/main/java/org/apache/calcite/rel/metadata/JaninoMetadataHandlerProvider.java
[3]
https://issues.apache.org/jira/browse/CALCITE-4546

On Wed, Oct 27, 2021 at 5:11 PM Jacques Nadeau wrote:

> Hey all,
>
> I've been working on AOT compilation with Graal (Janino not usable) and I'm
> struggling with the hierarchy of classes related to RelMetadataQuery. Right
> now it feels like there is tight coupling between how the
> JaninoRelMetadataProvider works and RelMetadataQuery. In a perfect world, it
> seems like the current implementation should be separated into
> JaninoRelMetadataQuery and RelMetadataQuery.
> - JaninoRelMetadataQuery: The current version of RelMetadataQuery (roughly)
> - RelMetadataQuery: An abstract class with minimal implementation that
> doesn't extend RelMetadataQueryBase (which is fully coupled
> to JaninoRelMetadataProvider)
>
> Existing users who extend RelMetadataQuery today would move to extending
> JaninoRelMetadataQuery. Some examples of the current problematic
> encapsulation:
> - Weird behavior in RelOptCluster: [1]
> - Several internal concerns leaked via public fields in
> RelMetadataQueryBase [2]
> - No way to override "initialHandler" in RelMetadataQueryBase because it is
> declared as a static method [3]
>
> Does anyone have strong opinions about this type of change?
>
> FYI, I know one of the key points of focus on this code was performance but
> unfortunately I don't notice any benchmarks in Calcite focused on this code
> (did I miss some?).
>
> [1]
>
> https://github.com/apache/calcite/blob/master/core/src/main/java/org/apache/calcite/plan/RelOptCluster.java#L156
> [2]
>
> https://github.com/apache/calcite/blob/master/core/src/main/java/org/apache/calcite/rel/metadata/RelMetadataQueryBase.java#L69
> [3]
>
> https://github.com/apache/calcite/blob/master/core/src/main/java/org/apache/calcite/rel/metadata/RelMetadataQueryBase.java#L84
>


Re: [ANNOUNCE] New committer: Xiong Duan

2021-10-27 Thread JiaTao Tao
Congratulations

Regards!

Aron Tao


On Tue, Oct 26, 2021 at 1:17 PM XING JIN wrote:

> Congratulations !
>
> Best,
> Jin
>
> On Mon, Oct 25, 2021 at 4:58 PM Chunwei Lei wrote:
>
> > Congratulations, Xiong!
> >
> >
> > Best,
> > Chunwei
> >
> >
> > On Sun, Oct 24, 2021 at 6:34 AM Haisheng Yuan wrote:
> >
> > > Congrats, Xiong!
> > >
> > > On 2021/10/23 21:23:59, Francis Chuang wrote:
> > > > Congratulations!
> > > >
> > > > On 24/10/2021 12:03 am, Stamatis Zampetakis wrote:
> > > > > Apache Calcite's Project Management Committee (PMC) has invited Xiong Duan
> > > > > to become a committer, and we are pleased to announce that they have
> > > > > accepted.
> > > > >
> > > > > Xiong has pushed a lot of high quality patches, fixing and improving code
> > > > > around aggregations and sub-queries, in a rather short period of time.
> > > > > Apart from code contributions, Xiong has been regularly reviewing PRs in
> > > > > GitHub and helping out others in various JIRA issues.
> > > > >
> > > > > Xiong, welcome, thank you for your contributions, and we look forward to
> > > > > your further interactions with the community! If you wish, please feel free
> > > > > to tell us more about yourself and what you are working on.
> > > > >
> > > > > Stamatis (on behalf of the Apache Calcite PMC)
> > > > >
> > > >
> > >
> >
>