Re: Trait propagation in heterogeneous plans

2021-05-06 Thread Vladimir Sitnikov
Vladimir,

I generally agree with what you are saying,

>Enumerable backend provides a clear and consistent contract: we support
collation and reset everything

That sounds like a way to go until there's a way to externalize "input
trait enforcement" rules.
"output" traits are simpler since they can be computed with metadataquery
(however, we still hard-code the set of computed traits).
It might be worth trying to compute all the traits known to the planner.

However, Enumerable could play well with in-core distribution trait as
well, so there's no need to limit enumerable to "collation only".

If you don't like the in-core distribution trait, you simply don't use it.
There's not much sense in limiting enumerable to collation only.

Vladimir


[jira] [Created] (CALCITE-4598) Forbid the use of System.out

2021-05-06 Thread Vladimir Sitnikov (Jira)
Vladimir Sitnikov created CALCITE-4598:
--

 Summary: Forbid the use of System.out
 Key: CALCITE-4598
 URL: https://issues.apache.org/jira/browse/CALCITE-4598
 Project: Calcite
  Issue Type: Improvement
Affects Versions: 1.26.0
Reporter: Vladimir Sitnikov


System.out.println often results in useless logging which is impossible to 
disable in the client code.

We should remove all the uses of System.out.println and add something like 
jdk-system-out from forbiddenApis or 
https://errorprone.info/bugpattern/SystemOut

I lean toward errorprone since it provides a clear way to suppress the 
warning (by adding an annotation).
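
A minimal sketch of what the cleanup could look like, assuming the errorprone
route. The class and method names are hypothetical, and java.util.logging is
used only to keep the example self-contained; it is not a statement about
which logging API Calcite code should use. The annotation value "SystemOut"
is the check name from the errorprone URL above.

```java
import java.util.logging.Level;
import java.util.logging.Logger;

/** Hypothetical example class; not part of Calcite. */
final class SystemOutExample {
  private static final Logger LOGGER =
      Logger.getLogger(SystemOutExample.class.getName());

  /** Library code: report through a logger so client code can disable it. */
  static String describePlan(int rowCount) {
    String message = "planned " + rowCount + " rows";
    // FINE is below the default console threshold, so this is silent
    // unless the client configures logging to show it.
    LOGGER.log(Level.FINE, message);
    return message;
  }

  /** Deliberate console output, e.g. a command-line tool printing usage. */
  @SuppressWarnings("SystemOut") // suppresses the errorprone check here only
  static void printUsage() {
    System.out.println("usage: example <model.json>");
  }
}
```

The suppression stays local to the one method that really needs console
output, while every other use would be reported by the check.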



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


Re: Trait propagation in heterogeneous plans

2021-05-06 Thread Vladimir Ozerov
Hi,

I'd like to stress that I am not trying to argue about subjective
concepts at all. Quite the opposite - I would like to agree or disagree on
a set of objective facts and find the solution. Specifically, from what I
saw in Calcite's codebase and real projects, I assert the following:

   1. Calcite-based projects may use custom traits.
   2. Enumerable in its current state cannot propagate any traits except
   for collation. The relevant code is simply missing from the product; it was
   never implemented.
   3. Despite (2), Enumerable rules/operators may demand unsupported traits
   from inputs, or expose unsupported traits, which may lead to problems on
   the user side (an example is in the first message of this thread).

Do you agree with these points?

If we are in agreement here, then I propose only one thing - fix (3),
because it affects real-life integrations. The fix is trivial:

   - Make sure that Enumerable operators never set non-default trait values
   for anything except for collation. For example, EnumerableProjectRule
   creates an operator with the correct trait set, whilst
   EnumerableFilterRule propagates unsupported traits.
   - Replace RelNode.getTraitSet with RelOptCluster.traitSet when deducing
   the desired input trait set in Enumerable rules.

These two fixes would ensure that we never have any non-default values of
any traits except for collation in Enumerable operators. On the one hand,
it fixes (3). On the other hand, it doesn't break anything, because thanks
to (2) there is nothing to break.

Does it make sense to you?

Regards,
Vladimir.




Re: Trait propagation in heterogeneous plans

2021-05-06 Thread Vladimir Sitnikov
>Enumerable in its current state cannot propagate any traits except for
collation

Enumerable can propagate in-core distribution trait.

Vladimir


Re: Trait propagation in heterogeneous plans

2021-05-06 Thread Vladimir Ozerov
It may propagate the in-core distribution in theory, if the relevant code
exists. Practically, there is no such code. For example, consider
EnumerableProject:

   1. EnumerableProjectRule.convert doesn't propagate input's distribution,
   thanks to EnumerableProject.create that uses RelOptCluster.traitSet.
   2. EnumerableProjectRule.derive also ignores all traits except for
   collation.

Therefore, irrespective of which trait set is present in the project's
input, the EnumerableProject will always have the default values for all
traits except for collation. This is what I refer to as "no trait
propagation". In this sense, EnumerableProject is an example of the correct
implementation wrt my proposal. But not all operators follow this, e.g.
EnumerableFilter.
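
The difference between the two styles can be sketched with a toy model. This
is not Calcite code: a trait set is modelled as a plain map, and every name
below (TraitResetSketch, defaultTraits, resetExceptCollation,
inheritFromExisting, sampleInput, the trait values) is hypothetical. It only
illustrates why starting from the cluster defaults drops custom traits, while
starting from an existing node's trait set lets them leak through.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Toy model only: a trait set is a map from trait name to value. */
final class TraitResetSketch {

  /** Stand-in for the default trait set (the role of RelOptCluster.traitSet). */
  static Map<String, String> defaultTraits() {
    Map<String, String> traits = new LinkedHashMap<>();
    traits.put("convention", "NONE");
    traits.put("collation", "[]");
    traits.put("distribution", "ANY");
    return traits;
  }

  /** Project-style: start from the defaults and carry over collation only. */
  static Map<String, String> resetExceptCollation(Map<String, String> input) {
    Map<String, String> traits = defaultTraits();
    traits.put("convention", "ENUMERABLE");
    traits.put("collation", input.getOrDefault("collation", "[]"));
    return traits;
  }

  /** Filter-style: start from an existing trait set, so all values survive. */
  static Map<String, String> inheritFromExisting(Map<String, String> existing) {
    Map<String, String> traits = new LinkedHashMap<>(existing);
    traits.put("convention", "ENUMERABLE");
    return traits;
  }

  /** An input that carries a non-default distribution. */
  static Map<String, String> sampleInput() {
    Map<String, String> traits = defaultTraits();
    traits.put("collation", "[0 ASC]");
    traits.put("distribution", "HASH[0]");
    return traits;
  }
}
```

With sampleInput(), resetExceptCollation keeps the collation and returns the
default distribution, whereas inheritFromExisting still reports the custom
distribution even though nothing in the operator enforces it.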



Re: Question about Calcite History

2021-05-06 Thread Michael Mior
1. The project entered the Apache incubator in 2014.
2. Optiq was the original name of the Calcite project. The name was
changed, but the project is the same and the current code base is
derived from the code of Optiq.
3. Calcite development is volunteer-driven so what features will be
implemented depends entirely on what volunteers choose to work on.
Some areas that have seen a lot of work over the past couple years are
JSON, geospatial, and streaming query support although there are many
other interesting things upcoming.

This is by no means an exhaustive list, but you can see new features
which have been logged in our issue tracker below:
https://issues.apache.org/jira/issues/?jql=project%20%3D%20CALCITE%20AND%20issuetype%20%3D%20%22New%20Feature%22%20AND%20resolution%20%3D%20Unresolved%20ORDER%20BY%20priority%20DESC%2C%20updated%20DESC
--
Michael Mior
mm...@apache.org

On Thu, May 6, 2021 at 00:23, Junwen Liu wrote:
>
> Hi Mr. or Ms.,
> I'm an engineer using Calcite as the optimizer in our project. Calcite is
> an amazing framework that meets our needs. What I mostly want to know is
> the history of Calcite, since we also want to create an open-source
> project. So I would like to know these things about Calcite:
> 1. When did you start this project?
> 2. What is the difference between Optiq and Calcite?
> 3. Where is Calcite going, and what features will it support?


Re: Question about Calcite History

2021-05-06 Thread Michael Mior
I should add that while the project entered the incubator in 2014, I
believe the development started a couple years earlier, but I'll leave
Julian Hyde (the original author of Optiq) to answer.
--
Michael Mior
mm...@apache.org



Re: Question about Calcite History

2021-05-06 Thread Julian Hyde
I gave a talk in January about the evolution of Calcite, as a technology and 
as a project.

https://www.youtube.com/watch?v=hf1wAnoly4g 







[jira] [Created] (CALCITE-4599) Is there any plan to support "date histogram aggregation"?

2021-05-06 Thread Jacky Yin (Jira)
Jacky Yin created CALCITE-4599:
--

 Summary: Is there any plan to support "date histogram aggregation"?
 Key: CALCITE-4599
 URL: https://issues.apache.org/jira/browse/CALCITE-4599
 Project: Calcite
  Issue Type: Improvement
  Components: elasticsearch-adapter
Reporter: Jacky Yin


"Date histogram aggregation" is one of the popular analysis functions of 
Elasticsearch. The current Calcite Elasticsearch adapter does not push it down 
to Elasticsearch. Is there any plan to support it? Taking the example below, 
if it could be pushed down to Elasticsearch, the query should be very 
efficient. 

"select count(*) as cc from t group by date_histogram(`@timestamp`, interval '5' minute)" 

Another question: there currently seems to be no suitable SQL function or 
keyword for date_histogram. One possible option is the TUMBLE function. Is 
that right? 
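
For reference, the grouping that the example query asks for can be sketched
as fixed-interval bucketing. This is only an illustration of the arithmetic,
assuming timestamps in epoch milliseconds and buckets aligned to the epoch; it
is not code from the adapter, and the class and method names are hypothetical.

```java
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical sketch of counting rows per fixed-width time bucket. */
final class DateHistogramSketch {

  /** Maps each bucket start (epoch millis) to the number of rows in it. */
  static Map<Long, Long> countPerBucket(long[] timestampsMillis,
      long intervalMillis) {
    Map<Long, Long> counts = new TreeMap<>();
    for (long ts : timestampsMillis) {
      // floorDiv rounds toward negative infinity, so timestamps before
      // the epoch also land in the bucket that starts at or before them.
      long bucketStart = Math.floorDiv(ts, intervalMillis) * intervalMillis;
      counts.merge(bucketStart, 1L, Long::sum);
    }
    return counts;
  }
}
```

With a 5-minute interval (300000 ms), rows at 0 ms and 299999 ms share the
bucket starting at 0, while a row at 300000 ms opens the next bucket.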


