date:20200421

Re: Stored Proc to Relational Expression

2020-04-21 Thread Ravi Kapoor

Yeah Stamatis, I know. I can see on the GitHub page for linq4j that many
tasks are still in the backlog, and the LINQ-to-SQL provider is one of them.
But the intention of the project is to let Java objects, as Queryables, talk
to the DB.

The Expressions and Statements in linq4j are used to generate constructs
such as WHERE filters and joins, which are sent to the back-end systems as
SQL queries.

But what about plain while loops and if-then-else statements that are not
used in Queryable objects?
An example is below:


LOOP
  SET heads = RAND() < 0.5;
  IF heads THEN
    SELECT 'Heads!';
    SET heads_count = heads_count + 1;
  ELSE
    SELECT 'Tails!';
    BREAK;
  END IF;
END LOOP;
SELECT CONCAT(CAST(heads_count AS STRING), ' heads in a row');


How is linq4j going to convert plain while and if-then-else Java expressions
to a SQL query if they are not involved in computing SQL constructs, as they
would be in the lambda expression of a filter?

In one of the backlog tasks I could see:
*In the prototype LINQ-to-SQL provider, write a simple rule to recognize a
select list and where clause and push them down to SQL.*

I believe it was never intended for procedural code, right?

Thanks,
Ravi


On Wed, Apr 22, 2020 at 2:43 AM Stamatis Zampetakis 
wrote:

> Hi Ravi,
>
> As Julian already mentioned, we are quite far from what you would like to
> achieve.
>
> Nevertheless, I would like to mention that you can model certain kind of
> loops and recursion using the RepeatUnion [1] and Spool [2] relational
> expressions. Note that these APIs are experimental.
>
> Moreover, if your end goal is to push the computation in a DBMS, I have to
> warn you that the code to translate these expressions back to SQL is not
> there yet.
>
> Best,
> Stamatis
>
> [1]
>
> https://github.com/apache/calcite/blob/master/core/src/main/java/org/apache/calcite/rel/core/RepeatUnion.java
> [2]
>
> https://github.com/apache/calcite/blob/master/core/src/main/java/org/apache/calcite/rel/core/Spool.java
>
> On Mon, Apr 20, 2020 at 8:08 PM Julian Hyde  wrote:
>
> > Calcite relational expressions can represent SELECT, INSERT etc. but not
> > procedural code. It’s a direction we could consider going.
> >
> > RexProgram is the closest thing we currently have to procedural code in
> > the algebra - single assignment of variables, use of variables in
> > expressions assigning to other variables - but it is a long way short
> > because there are no loops.
> >
> > > On Apr 19, 2020, at 12:47 PM, Ravi Kapoor 
> > wrote:
> > >
> > > Hi Team,
> > >
> > > I have my use where I need to convert my dialect specific stored
> > procedure
> > > constructs like while loop, If then else to Rel expression
> > >
> > > Basically this can contain control flow statements like below
> > >
> > > DECLARE heads BOOL;
> > > DECLARE heads_count INT64 DEFAULT 0;
> > > LOOP
> > >  SET heads = RAND() < 0.5;
> > >  IF heads THEN
> > >SELECT 'Heads!';
> > >SET heads_count = heads_count + 1;
> > >  ELSE
> > >SELECT 'Tails!';
> > >BREAK;
> > >  END IF;
> > > END LOOP;
> > > SELECT CONCAT(CAST(heads_count AS STRING), ' heads in a row');
> > >
> > >
> > > I can create a Java AST model from the linq4j provided by calcite
> however
> > > this is only going to generate Java Result and I believe its only used
> by
> > > the calcite for relational expressions of enumerable calling convention
> > > which is used by adapters which does not support core relational
> > operations
> > > right?
> > >
> > > Is there a way I can convert the stored proc constructs into some
> > canonical
> > > form like Rel Tree and back to Stored proc of target dialect.
> > > --
> > >
> > > Thanks,
> > > Ravi
> >
> >
>


-- 
Thanks,
Ravi Kapoor
+91-9818764564
kapoorrav...@gmail.com

Re: [DISCUSS] Towards Cascades Optimizer

2020-04-21 Thread Danny Chan

> Is there any recommended approach to make that happen smoothly besides
> coding and testing work? We need to be aware that the new planner might be
> co-exist with VolcanoPlanner for 5 or more years, or even never replace
> VolcanoPlanner.

If that is true, I would say the new planner probably does not have a very
good design. We should find out in advance in which cases, and for what
reasons, users would keep the old VolcanoPlanner, and we *must* give a
solution for those problems in the new design.

I expect that migrating to a new planner would take at least 1 year of
development. If that is true, modifying the current planner directly means
that the next 3~4 Calcite versions would each bring huge plan changes/bugs,
which I believe no Calcite user wants to see. And no one can guarantee that
modifying it directly will keep good stability and compatibility; only the
test set can.

From the experience of the Alibaba Blink planner, which was contributed to
Apache Flink: yes, the old and new planners would co-exist for at least 2
years, because the new and old planners have different abilities in some
corner cases.

From my point of view, we should at least:
- Give a convincing test set for the new planner that makes us believe the
new planner is stable and powerful enough. The current rule tests are
obviously far from enough to support the new planner.
- Give a more detailed design doc about the new planner, especially about
the interface changes and any change that would bring compatibility
problems. Then we can estimate more accurately how much work the new planner
would bring, and only then decide whether switching to pure new-planner
development is a good idea or whether to modify the existing one.


On Wed, Apr 22, 2020 at 9:45 AM, Haisheng Yuan wrote:

> Hi Andrii,
>
> > Obviously, from what is written here, I could guess that this would
> require me to change my physical planning rules, even if only by
> implementing a marker interface.
> You don't need to change your physical rules, it will be treated as equal
> as logical rules and be applied together with the real logical rules, no
> more logical/physical rules difference. This is also how current
> VolcanoPlanner works.
>
> > I don't want you to think that I somehow resent the changes you are
> pushing.
> Don't get me wrong. I am seriously thinking of revert these changes, since
> most people like the idea of adding new planner, why don't we make all the
> plan changes in the new planner, instead of forcing people changing test
> cases for the code changes that they might not need in VolcanoPlanner
> during upgrade.
>
> I didn't intend to replace VolcanoPlanner, thought just change the search
> strategy and add trait derivation mechanism, because most of the code in
> VolcanoPlanner can be reused. But since many agree to add new planner and
> replace VolcanoPlanner as the final goal, I won't be against most people's
> decision.
>
> Is there any recommended approach to make that happen smoothly besides
> coding and testing work? We need to be aware that the new planner might be
> co-exist with VolcanoPlanner for 5 or more years, or even never replace
> VolcanoPlanner.
>
> More thoughts are welcome.
>
> Haisheng
>
> On 2020/04/21 19:56:25, Андрей Цвелодуб  wrote:
> > Hello Haisheng,
> >
> > > To keep backward compatibility, all the un-marked rules will be treated
> > as logical rules, except rules that uses AbstractConverter as rule
> operand,
> > these rules still need to applied top-down, or random order.
> > Obviously, from what is written here, I could guess that this would
> require
> > me to change my physical planning rules, even if only by implementing a
> > marker interface. I am not saying this is a bad thing, but this is a
> thing
> > that should be communicated and planned ahead in case the VolcanoPlanner
> is
> > modified.
> >
> > > Looks like I have to revert changes in CALCITE-2970 and CALCITE-3753,
> > because they will cause another tons of plan changes.
> > I see you are still bitter due to all the discussions on this list
> lately,
> > I'm sorry. I don't want you to think that I somehow resent the changes
> you
> > are pushing, au contraire I support them and would be happy to help if I
> > can. I just want the process of these changes to be executed in the best
> > possible way.
> > As I see there are already several opinions in this thread that basically
> > align with what I am saying, so I guess I am not the crazy guy running
> > around and yelling "the end is nigh!".
> >
> > Thank you for taking these mumbled thoughts into account.
> >
> > Bestest Regards,
> > Andrii Tsvielodub
> >
> > On Tue, 21 Apr 2020 at 21:08, Haisheng Yuan  wrote:
> >
> > > Hi Andrii,
> > >
> > > > I guess changing the planner would lead to changes in tons of rules
> and
> > > even more tests.
> > > Obviously you didn't read through my email. You are not required to do
> any
> > > changes to your rule if you d

Re: Understanding annotations of SqlGroupingFunction

2020-04-21 Thread XING JIN

Hi Vineet ~
+1 on your analysis.
Check the case below from agg.iq. We can see that the GROUPING function in
Calcite behaves the same as in Hive.

# GROUPING in SELECT clause of CUBE query
select deptno, job, count(*) as c, grouping(deptno) as d,
  grouping(job) j, grouping(deptno, job) as x
from "scott".emp
group by cube(deptno, job);
+--------+-----------+----+---+---+---+
| DEPTNO | JOB       | C  | D | J | X |
+--------+-----------+----+---+---+---+
|     10 | CLERK     |  1 | 0 | 0 | 0 |
|     10 | MANAGER   |  1 | 0 | 0 | 0 |
|     10 | PRESIDENT |  1 | 0 | 0 | 0 |
|     10 |           |  3 | 0 | 1 | 1 |
|     20 | ANALYST   |  2 | 0 | 0 | 0 |
|     20 | CLERK     |  2 | 0 | 0 | 0 |
|     20 | MANAGER   |  1 | 0 | 0 | 0 |
|     20 |           |  5 | 0 | 1 | 1 |
|     30 | CLERK     |  1 | 0 | 0 | 0 |
|     30 | MANAGER   |  1 | 0 | 0 | 0 |
|     30 | SALESMAN  |  4 | 0 | 0 | 0 |
|     30 |           |  6 | 0 | 1 | 1 |
|        | ANALYST   |  2 | 1 | 0 | 2 |
|        | CLERK     |  4 | 1 | 0 | 2 |
|        | MANAGER   |  3 | 1 | 0 | 2 |
|        | PRESIDENT |  1 | 1 | 0 | 2 |
|        | SALESMAN  |  4 | 1 | 0 | 2 |
|        |           | 14 | 1 | 1 | 3 |
+--------+-----------+----+---+---+---+

IMHO, we might correct the doc of SqlGroupingFunction as below:

* 0 if both deptno and gender are being grouped,
* 1 if only deptno is being grouped,
* 2 if only gender is being grouped,
* 3 if neither deptno nor gender are being grouped.
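
One way to read the corrected values is as a bitmask. The following is a
minimal Java sketch, not Calcite's implementation: the class GroupingSketch
and its grouping method are made-up names. It assumes that the leftmost
argument maps to the highest bit and that a bit is 1 when the column is
rolled up (absent from the grouping set that produced the row), which is the
pattern the X column of the agg.iq output above follows.

```java
import java.util.List;
import java.util.Set;

/** Illustrative only: a toy model of GROUPING, not Calcite's code. */
class GroupingSketch {
  /**
   * Computes the GROUPING value of the given arguments for a row produced
   * by the given grouping set. The leftmost argument becomes the highest
   * bit; a bit is 1 when the column is NOT in the grouping set.
   */
  static long grouping(List<String> arguments, Set<String> groupingSet) {
    long result = 0;
    for (String argument : arguments) {
      result <<= 1;
      if (!groupingSet.contains(argument)) {
        result |= 1;
      }
    }
    return result;
  }

  public static void main(String[] unused) {
    List<String> arguments = List.of("deptno", "job");
    // Both columns grouped, e.g. the (10, CLERK) row.
    System.out.println(grouping(arguments, Set.of("deptno", "job")));
    // Only deptno grouped, e.g. the per-department total rows.
    System.out.println(grouping(arguments, Set.of("deptno")));
    // Only job grouped, e.g. the per-job total rows.
    System.out.println(grouping(arguments, Set.of("job")));
    // Neither grouped: the grand total row.
    System.out.println(grouping(arguments, Set.of()));
  }
}
```

Under these assumptions the four calls yield 0, 1, 2 and 3, in the same order
as the four cases listed above.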


- Jin


On Wed, Apr 22, 2020 at 5:18 AM, Vineet G wrote:

> I expect that the user behavior for the GROUPING in both hive and calcite
> is same. It’s just the documentation which is a bit confusing.
> e.g.  comment line on grouping : if both deptno and gender are being
> grouped
>
> should really mean that the row which represents the grand total i.e
> without group by expression. will return 3.
>
> FYI Hive had its behavior fixed with
> https://issues.apache.org/jira/browse/HIVE-16102 <
> https://issues.apache.org/jira/browse/HIVE-16102> and currently uses
> GROUPING ID function.
>
> Vineet Garg
>
>
> > On Apr 21, 2020, at 10:42 AM, Julian Hyde 
> wrote:
> >
> > Suppose we have one row that represents the total for department 10, and
> another that represents the grand total of all departments. Which row would
> we say that department is “grouped” (in Calcite’s parlance) or “aggregated”
> in (Hive’s parlance)?
> >
> > I find the terms confusing. It’s possible that Calcite has them “wrong”.
> >
> > I would simply run a query like
> >
> >  SELECT deptno, job, GROUPING(deptno, job), COUNT(*)
> >  FROM emp
> >  GROUP BY CUBE (deptno, job)
> >
> > and see whether Hive and Calcite return the same result.
> >
> >> On Apr 20, 2020, at 6:58 PM, ZZY  wrote:
> >>
> >> Hi, Hyde:
> >> It's confused me that some annotations in
> >> Calcite(org.apache.calcite.sql.fun.SqlGroupingFunction.java) :
> >> /**
> >> * The {@code GROUPING} function.
> >> *
> >> * Accepts 1 or more arguments.
> >> * Example: {@code GROUPING(deptno, gender)} returns
> >> * 3 if both deptno and gender are being grouped,
> >> * 2 if only deptno is being grouped,
> >> * 1 if only gender is being groped,
> >> * 0 if neither deptno nor gender are being grouped.
> >> *
> >> * This function is defined in the SQL standard.
> >> * {@code GROUPING_ID} is a non-standard synonym.
> >> *
> >> * Some examples are in {@code agg.iq}.
> >> */
> >>
> >> The annotations above seems conflicts with other implementations like
> Hive(
> >>
> https://cwiki.apache.org/confluence/display/Hive/Enhanced+Aggregation%2C+Cube%2C+Grouping+and+Rollup?spm=ata.13261165.0.0.528c6dfcXalQFy#EnhancedAggregation,Cube,GroupingandRollup-Groupingfunction
> >> )
> >>
> >> Notice that: "The grouping function indicates whether an expression in a
> >> GROUP BY clause is aggregated or not for a given row. The value 0
> >> represents a column that is part of the grouping set, while the value 1
> >> represents a column that is not part of the grouping set. "
> >>
> >>
> >> It is clearly that 0 and 1 bit have different interpretation  between
> >> annotations in Calcite and in Hive. And I did not figure out why...
> >>
> >> Any feedback can give me on this would be highly appreciated.
> >>
> >> Best regards!
>
>

[jira] [Created] (CALCITE-3949) RelDistributions.of() and RelCollations.of() should canonize trait instance

2020-04-21 Thread Haisheng Yuan (Jira)

Haisheng Yuan created CALCITE-3949:
--

 Summary: RelDistributions.of() and RelCollations.of() should 
canonize trait instance
 Key: CALCITE-3949
 URL: https://issues.apache.org/jira/browse/CALCITE-3949
 Project: Calcite
  Issue Type: Bug
  Components: core
Reporter: Haisheng Yuan






--
This message was sent by Atlassian Jira
(v8.3.4#803005)

Re: [DISCUSS] Towards Cascades Optimizer

2020-04-21 Thread Haisheng Yuan

Hi Andrii,

> Obviously, from what is written here, I could guess that this would require 
> me to change my physical planning rules, even if only by implementing a 
> marker interface.
You don't need to change your physical rules; they will be treated the same as
logical rules and applied together with the real logical rules, with no more
logical/physical rule distinction. This is also how the current VolcanoPlanner
works.
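
The marker-interface idea referred to in the quoted text (an empty interface
that tells the planner a rule is physical, with un-marked rules treated as
logical) could look roughly like the Java sketch below. Everything in it is an
assumption for illustration: the thread does not name the interface, so
PhysicalRule, Rule and the two rule classes are hypothetical and are not
Calcite types.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model of marking planner rules; none of these types are Calcite's. */
class MarkerInterfaceSketch {
  /** Stand-in for a planner rule (hypothetical, far simpler than RelOptRule). */
  interface Rule {
    String name();
  }

  /** Hypothetical empty marker interface; implementing it is the only change. */
  interface PhysicalRule {}

  /** An existing rule left untouched; it carries no marker. */
  static final class FilterMergeRule implements Rule {
    @Override public String name() {
      return "FilterMergeRule";
    }
  }

  /** A rule whose author opted in by adding the marker. */
  static final class HashJoinRule implements Rule, PhysicalRule {
    @Override public String name() {
      return "HashJoinRule";
    }
  }

  /** Un-marked rules fall through to "logical", so old rules keep working. */
  static String classify(Rule rule) {
    return rule instanceof PhysicalRule ? "physical" : "logical";
  }

  public static void main(String[] args) {
    List<Rule> rules = new ArrayList<>();
    rules.add(new FilterMergeRule());
    rules.add(new HashJoinRule());
    for (Rule rule : rules) {
      System.out.println(rule.name() + " -> " + classify(rule));
    }
  }
}
```

The point of the design is that the default branch is "logical": a rule that
is never touched is classified without any code change on the user's side.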

> I don't want you to think that I somehow resent the changes you are pushing.
Don't get me wrong. I am seriously thinking of reverting these changes. Since
most people like the idea of adding a new planner, why don't we make all the
plan changes in the new planner, instead of forcing people to change test cases
during an upgrade for code changes that they might not need in VolcanoPlanner?

I didn't intend to replace VolcanoPlanner; I thought I would just change the
search strategy and add a trait derivation mechanism, because most of the code
in VolcanoPlanner can be reused. But since many agree to add a new planner with
replacing VolcanoPlanner as the final goal, I won't go against the majority's
decision.

Is there any recommended approach to make that happen smoothly besides coding
and testing work? We need to be aware that the new planner might co-exist
with VolcanoPlanner for 5 or more years, or even never replace VolcanoPlanner.

More thoughts are welcome.

Haisheng

On 2020/04/21 19:56:25, Андрей Цвелодуб  wrote: 
> Hello Haisheng,
> 
> > To keep backward compatibility, all the un-marked rules will be treated
> as logical rules, except rules that uses AbstractConverter as rule operand,
> these rules still need to applied top-down, or random order.
> Obviously, from what is written here, I could guess that this would require
> me to change my physical planning rules, even if only by implementing a
> marker interface. I am not saying this is a bad thing, but this is a thing
> that should be communicated and planned ahead in case the VolcanoPlanner is
> modified.
> 
> > Looks like I have to revert changes in CALCITE-2970 and CALCITE-3753,
> because they will cause another tons of plan changes.
> I see you are still bitter due to all the discussions on this list lately,
> I'm sorry. I don't want you to think that I somehow resent the changes you
> are pushing, au contraire I support them and would be happy to help if I
> can. I just want the process of these changes to be executed in the best
> possible way.
> As I see there are already several opinions in this thread that basically
> align with what I am saying, so I guess I am not the crazy guy running
> around and yelling "the end is nigh!".
> 
> Thank you for taking these mumbled thoughts into account.
> 
> Bestest Regards,
> Andrii Tsvielodub
> 
> On Tue, 21 Apr 2020 at 21:08, Haisheng Yuan  wrote:
> 
> > Hi Andrii,
> >
> > > I guess changing the planner would lead to changes in tons of rules and
> > even more tests.
> > Obviously you didn't read through my email. You are not required to do any
> > changes to your rule if you don't want to, but if you do, just need to mark
> > the rule to tell planner whether it is a physical rule or not, simply by
> > implementing an empty interface.
> >
> > > many on this list already experienced problems with upgrading even
> > between the minor versions of Calcite.
> > Sorry to see the problem you have experienced when upgrading Calcite.
> > Looks like I have to revert changes in CALCITE-2970 and CALCITE-3753,
> > because they will cause another tons of plan changes.
> >
> > But I will see if I can add a setting to use the old search strategy,
> > which can be left untouched.
> >
> > Haisheng
> >
> > On 2020/04/21 06:33:08, Андрей Цвелодуб  wrote:
> > > Hello everyone,
> > >
> > > First of all, thanks for this great effort of improving the core parts of
> > > the framework we all are using,
> > > I believe this is long overdue and hope this will have benefits both for
> > > the maintainers and users of the library.
> > >
> > > I don't have anything to say about the general idea at the moment,
> > > but I want to make a point that maintaining the old implementation of
> > > VolcanoPlanner during
> > > the initial stages of implementing the new planner is absolutely
> > CRITICAL.
> > > As a lot of users of Calcite do various customizations to the engine, to
> > > the rules
> > > and all that is there in between, I believe changing the implementation
> > of
> > > the core component
> > > would have a huge impact on most users of the library. I think many on
> > this
> > > list
> > > already experienced problems with upgrading even between the minor
> > versions
> > > of Calcite,
> > > so I guess changing the planner would lead to changes in tons of rules
> > and
> > > even more tests.
> > >
> > > I don't have anything against replacing VolcanoPlanner as a final goal of
> > > this effort,
> > > but I don't think that modifying it directly and merging it to master is
> > a
> > > viable development approach.

Calcite-Master - Build # 1712 - Still Failing

2020-04-21 Thread Apache Jenkins Server

The Apache Jenkins build system has built Calcite-Master (build #1712)

Status: Still Failing

Check console output at https://builds.apache.org/job/Calcite-Master/1712/ to 
view the results.

Re: RelMetadataQuery.getRowCount stackoverflow

2020-04-21 Thread Haisheng Yuan

I think the cyclic triggering related to ProjectMergeRule has been worked
around by CALCITE-3774 [1].

Haisheng

[1] https://issues.apache.org/jira/browse/CALCITE-3774

On 2020/04/21 21:20:35, Stamatis Zampetakis  wrote: 
> I've seen it also quite a few times and most often the reason was cycles in
> the subsets that were triggering some rules (e.g., ProjectMergeRule [1]) to
> fire an infinite number of times [2, 3, 4].
> 
> Best,
> Stamatis
> 
> [1]
> https://github.com/apache/calcite/blob/master/core/src/main/java/org/apache/calcite/rel/rules/ProjectMergeRule.java
> [2] https://issues.apache.org/jira/browse/CALCITE-3505
> [3] https://issues.apache.org/jira/browse/CALCITE-2223
> [4] https://issues.apache.org/jira/browse/CALCITE-3124
> 
> On Tue, Apr 21, 2020 at 6:43 AM Haisheng Yuan  wrote:
> 
> > Can you add a reproducible test case and log a JIRA? It may be different
> > with CALCITE-2057. People who is interested will investigate the issue.
> >
> > On 2020/04/21 04:24:31, JiaTao Tao  wrote:
> > > Thanks
> > > I didn't add any new rule, just these:
> > >
> > > CONSTANT_REDUCTION_RULES
> > > ABSTRACT_RELATIONAL_RULES
> > > BASE_RULES
> > > ABSTRACT_RULES
> > > ENUMERABLE_RULES
> > >
> > > So this is a bug, or it just because of the call stack is too deep(if
> > this,
> > > I can adjust JVM parameter).
> > >
> > > Regards!
> > >
> > > Aron Tao
> > >
> > >
> > > On Tue, Apr 21, 2020 at 1:10 AM, Scott Reynolds wrote:
> > >
> > > > I have had this happen numerous times when writing new planner rules.
> > Most
> > > > of the time my rule is missing some boolean logic to prevent itself
> > from
> > > > transforming the call. This results in the rule continuously
> > transforming
> > > > it's previous transformations.
> > > >
> > > > I can usually see this happening when I add a
> > > > System.out.println(RelOptUtil.dumpPlan()) to the line before the
> > > > call.transformTo(newRelationNode)
> > > >
> > > > On Mon, Apr 20, 2020 at 3:13 AM JiaTao Tao 
> > wrote:
> > > >
> > > > > Hi
> > > > > Has anyone encountered this problem before? Just a simple query(no
> > more
> > > > > than 20 lines, two joins, no union).
> > > > >
> > > > > And I see this ticket:
> > > > https://issues.apache.org/jira/browse/CALCITE-2057,
> > > > > but there's no follow up, also I see flink may occur this problem(
> > > > > https://developer.aliyun.com/ask/129548)
> > > > >
> > > > > java.lang.StackOverflowError
> > > > > at java.util.HashMap.hash(HashMap.java:339)
> > > > > at java.util.HashMap.put(HashMap.java:612)
> > > > > at
> > > > >
> > > >
> > com.google.common.collect.StandardTable.getOrCreate(StandardTable.java:165)
> > > > > at
> > > > com.google.common.collect.StandardTable.put(StandardTable.java:174)
> > > > > at
> > > > com.google.common.collect.HashBasedTable.put(HashBasedTable.java:55)
> > > > > at GeneratedMetadataHandler_RowCount.getRowCount(Unknown Source)
> > > > > at
> > > > >
> > > >
> > org.apache.calcite.rel.metadata.RelMetadataQuery.getRowCount(RelMetadataQuery.java:208)
> > > > > at
> > > > >
> > > >
> > org.apache.calcite.rel.metadata.RelMdRowCount.getRowCount(RelMdRowCount.java:72)
> > > > > at GeneratedMetadataHandler_RowCount.getRowCount_$(Unknown
> > Source)
> > > > > at GeneratedMetadataHandler_RowCount.getRowCount(Unknown Source)
> > > > > at ...
> > > > >
> > > > > Regards!
> > > > >
> > > > > Aron Tao
> > > > >
> > > >
> > >
> >
>

Re: Building a Calcite Adapter

2020-04-21 Thread Stamatis Zampetakis

Hi Jon,

Thanks for your kind words. I'm sure people working on the project are very
happy to receive some positive feedback for their work from time to time :)

I had a quick look at your project and it definitely looks interesting.

If your engine (Crux) uses better join algorithms than the ones provided by
Calcite and if you have an optimizer that can apply join re-ordering and
other optimization techniques efficiently then I guess going further and
pushing joins and other things to Crux is a good idea.

Having said that, I am not sure the TranslatableTable approach will get
you much further in this direction.
I would suggest having a look at JdbcConvention [1] to see how the notion
of Convention, along with the respective rules and relational expressions,
helps to push operations into traditional RDBMSs. The Cassandra, Mongo, and
Elastic adapters are not very good examples, since the underlying engines
do not support joins.

I am not aware if there are people offering consulting services for Calcite
but I guess if there are you will know already.
Apart from that the project has many volunteers willing to help so if you
have more questions don't hesitate to send them to this list.

Best,
Stamatis

[1]
https://github.com/apache/calcite/blob/master/core/src/main/java/org/apache/calcite/adapter/jdbc/JdbcConvention.java

On Tue, Apr 7, 2020, 12:22 PM Jon Pither  wrote:

> Hi Calcite Devs,
>
> Firstly, thank you to all of you for building this fantastic tool.
>
> I'm currently experimenting with using Calcite on top of our document
> database Crux (opencrux.com) offering bitemporal features using a Datalog
> query language. You can see our efforts here, written in Clojure!
>
>
> https://github.com/juxt/crux/blob/jp/calcite/crux-calcite/src/crux/calcite.clj
>
> https://github.com/juxt/crux/blob/jp/calcite/crux-test/test/crux/calcite_test.clj
>
> So far we've been impressed at the power Calcite gives, with such little
> amount of integration code needed.
>
> We now have an initial MVP working using the ProjectableFilterableTable
> route. The adapter is basically constructing a Datalog query that we then
> execute against our DB.
>
> So far so good, and now I have some initial questions:
>
> Firstly, in this code we're making use of ProjectableFilterableTable to get
> us up and running. I've looked at the Mongo and Elastic adapters in the
> Calcite source, and they opt for TranslatableTable which is a deeper
> integration. From I can see the immediate disadvantage of
> ProjectableFilterableTable is that it's a query per table, meaning that we
> can't efficiently delegate joins to our DB.
>
> Moving to TranslatableTable would be a significant investment for us. My
> first question is: would you encourage us to make this investment, given
> we've got something up and running using ProjectableFilterableTable, with
> Calcite doing the heavy lifting? Please could you also advise on soliciting
> mentoring / consulting to help guide us, for which we can compensate.
>
> Our next question is around temporality. I can see in the Calcite code that
> there is a concept of a TemporalTable, supporing "FOR SYSTEM_TIME AS OF X".
> It looks like we wouldn't be able to make use of this using
> ProjectableFilterableTable, at least this is my experience thus far. In
> Crux we also expose VALID_TIME to our users to be able to query for,
> whereby users can query against VALID_TIME and/or SYSTEM_TIME. How might
> you recommend we achieve this using Calcite?
>
> Thanks & Regards,
>
> Jon
>

Calcite-Master - Build # 1711 - Failure

2020-04-21 Thread Apache Jenkins Server

The Apache Jenkins build system has built Calcite-Master (build #1711)

Status: Failure

Check console output at https://builds.apache.org/job/Calcite-Master/1711/ to 
view the results.

Re: RelMetadataQuery.getRowCount stackoverflow

2020-04-21 Thread Stamatis Zampetakis

I've seen it also quite a few times and most often the reason was cycles in
the subsets that were triggering some rules (e.g., ProjectMergeRule [1]) to
fire an infinite number of times [2, 3, 4].
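
A rule that keeps firing on its own output, as in the missing-guard case
described in the quoted text below, can be shown with a toy model. This is
only a sketch under stated assumptions: Plan, guardedMerge, unguardedMerge and
run are hypothetical names, the "planner" is a plain work queue, and unlike
VolcanoPlanner it does no de-duplication of equivalent expressions.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.Optional;
import java.util.function.Function;

/** Toy model of rule firing; not Calcite's planner or ProjectMergeRule. */
class RuleGuardSketch {
  /** Toy plan: a stack of `projects` Project operators over a scan. */
  static final class Plan {
    final int projects;

    Plan(int projects) {
      this.projects = projects;
    }
  }

  /** No guard: always reports a transformation, even with nothing to merge. */
  static Optional<Plan> unguardedMerge(Plan plan) {
    return Optional.of(new Plan(Math.max(1, plan.projects - 1)));
  }

  /** Guarded: fires only when two adjacent Projects actually exist. */
  static Optional<Plan> guardedMerge(Plan plan) {
    if (plan.projects < 2) {
      return Optional.empty();
    }
    return Optional.of(new Plan(plan.projects - 1));
  }

  /**
   * Applies the rule to every newly produced plan and returns the number of
   * firings, giving up once maxFirings is reached.
   */
  static int run(Function<Plan, Optional<Plan>> rule, Plan start, int maxFirings) {
    Deque<Plan> queue = new ArrayDeque<>();
    queue.add(start);
    int firings = 0;
    while (!queue.isEmpty() && firings < maxFirings) {
      Optional<Plan> result = rule.apply(queue.poll());
      if (result.isPresent()) {
        firings++;
        queue.add(result.get()); // the new plan is matched again
      }
    }
    return firings;
  }

  public static void main(String[] args) {
    // Stops by itself after merging three Projects down to one.
    System.out.println(run(RuleGuardSketch::guardedMerge, new Plan(3), 1000));
    // Never stops by itself; only the firing cap ends the loop.
    System.out.println(run(RuleGuardSketch::unguardedMerge, new Plan(3), 1000));
  }
}
```

In this model the guarded rule fires twice and then declines to match, while
the unguarded rule runs until the cap of 1000 firings is hit.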

Best,
Stamatis

[1]
https://github.com/apache/calcite/blob/master/core/src/main/java/org/apache/calcite/rel/rules/ProjectMergeRule.java
[2] https://issues.apache.org/jira/browse/CALCITE-3505
[3] https://issues.apache.org/jira/browse/CALCITE-2223
[4] https://issues.apache.org/jira/browse/CALCITE-3124

On Tue, Apr 21, 2020 at 6:43 AM Haisheng Yuan  wrote:

> Can you add a reproducible test case and log a JIRA? It may be different
> with CALCITE-2057. People who is interested will investigate the issue.
>
> On 2020/04/21 04:24:31, JiaTao Tao  wrote:
> > Thanks
> > I didn't add any new rule, just these:
> >
> > CONSTANT_REDUCTION_RULES
> > ABSTRACT_RELATIONAL_RULES
> > BASE_RULES
> > ABSTRACT_RULES
> > ENUMERABLE_RULES
> >
> > So this is a bug, or it just because of the call stack is too deep(if
> this,
> > I can adjust JVM parameter).
> >
> > Regards!
> >
> > Aron Tao
> >
> >
> > On Tue, Apr 21, 2020 at 1:10 AM, Scott Reynolds wrote:
> >
> > > I have had this happen numerous times when writing new planner rules.
> Most
> > > of the time my rule is missing some boolean logic to prevent itself
> from
> > > transforming the call. This results in the rule continuously
> transforming
> > > it's previous transformations.
> > >
> > > I can usually see this happening when I add a
> > > System.out.println(RelOptUtil.dumpPlan()) to the line before the
> > > call.transformTo(newRelationNode)
> > >
> > > On Mon, Apr 20, 2020 at 3:13 AM JiaTao Tao 
> wrote:
> > >
> > > > Hi
> > > > Has anyone encountered this problem before? Just a simple query(no
> more
> > > > than 20 lines, two joins, no union).
> > > >
> > > > And I see this ticket:
> > > https://issues.apache.org/jira/browse/CALCITE-2057,
> > > > but there's no follow up, also I see flink may occur this problem(
> > > > https://developer.aliyun.com/ask/129548)
> > > >
> > > > java.lang.StackOverflowError
> > > > at java.util.HashMap.hash(HashMap.java:339)
> > > > at java.util.HashMap.put(HashMap.java:612)
> > > > at
> > > >
> > >
> com.google.common.collect.StandardTable.getOrCreate(StandardTable.java:165)
> > > > at
> > > com.google.common.collect.StandardTable.put(StandardTable.java:174)
> > > > at
> > > com.google.common.collect.HashBasedTable.put(HashBasedTable.java:55)
> > > > at GeneratedMetadataHandler_RowCount.getRowCount(Unknown Source)
> > > > at
> > > >
> > >
> org.apache.calcite.rel.metadata.RelMetadataQuery.getRowCount(RelMetadataQuery.java:208)
> > > > at
> > > >
> > >
> org.apache.calcite.rel.metadata.RelMdRowCount.getRowCount(RelMdRowCount.java:72)
> > > > at GeneratedMetadataHandler_RowCount.getRowCount_$(Unknown
> Source)
> > > > at GeneratedMetadataHandler_RowCount.getRowCount(Unknown Source)
> > > > at ...
> > > >
> > > > Regards!
> > > >
> > > > Aron Tao
> > > >
> > >
> >
>

Re: Understanding annotations of SqlGroupingFunction

2020-04-21 Thread Vineet G

I expect that the user-visible behavior of GROUPING is the same in both Hive
and Calcite. It's just the documentation that is a bit confusing.
E.g. the comment line on grouping: "if both deptno and gender are being
grouped"

should really mean the row that represents the grand total, i.e. the row
without any group-by expression, which will return 3.

FYI Hive had its behavior fixed with
https://issues.apache.org/jira/browse/HIVE-16102
and currently uses the GROUPING ID function.

Vineet Garg


> On Apr 21, 2020, at 10:42 AM, Julian Hyde  wrote:
> 
> Suppose we have one row that represents the total for department 10, and 
> another that represents the grand total of all departments. Which row would 
> we say that department is “grouped” (in Calcite’s parlance) or “aggregated” 
> in (Hive’s parlance)?
> 
> I find the terms confusing. It’s possible that Calcite has them “wrong”. 
> 
> I would simply run a query like
> 
>  SELECT deptno, job, GROUPING(deptno, job), COUNT(*)
>  FROM emp
>  GROUP BY CUBE (deptno, job)
> 
> and see whether Hive and Calcite return the same result. 
> 
>> On Apr 20, 2020, at 6:58 PM, ZZY  wrote:
>> 
>> Hi, Hyde:
>> It's confused me that some annotations in
>> Calcite(org.apache.calcite.sql.fun.SqlGroupingFunction.java) :
>> /**
>> * The {@code GROUPING} function.
>> *
>> * Accepts 1 or more arguments.
>> * Example: {@code GROUPING(deptno, gender)} returns
>> * 3 if both deptno and gender are being grouped,
>> * 2 if only deptno is being grouped,
>> * 1 if only gender is being groped,
>> * 0 if neither deptno nor gender are being grouped.
>> *
>> * This function is defined in the SQL standard.
>> * {@code GROUPING_ID} is a non-standard synonym.
>> *
>> * Some examples are in {@code agg.iq}.
>> */
>> 
>> The annotations above seems conflicts with other implementations like Hive(
>> https://cwiki.apache.org/confluence/display/Hive/Enhanced+Aggregation%2C+Cube%2C+Grouping+and+Rollup?spm=ata.13261165.0.0.528c6dfcXalQFy#EnhancedAggregation,Cube,GroupingandRollup-Groupingfunction
>> )
>> 
>> Notice that: "The grouping function indicates whether an expression in a
>> GROUP BY clause is aggregated or not for a given row. The value 0
>> represents a column that is part of the grouping set, while the value 1
>> represents a column that is not part of the grouping set. "
>> 
>> 
>> It is clearly that 0 and 1 bit have different interpretation  between
>> annotations in Calcite and in Hive. And I did not figure out why...
>> 
>> Any feedback can give me on this would be highly appreciated.
>> 
>> Best regards!

[jira] [Created] (CALCITE-3948) Improve operand's RelSubset matching handling in VolcanoRuleCall

2020-04-21 Thread Botong Huang (Jira)

Botong Huang created CALCITE-3948:
-

 Summary: Improve operand's RelSubset matching handling in 
VolcanoRuleCall
 Key: CALCITE-3948
 URL: https://issues.apache.org/jira/browse/CALCITE-3948
 Project: Calcite
  Issue Type: Improvement
Reporter: Botong Huang


When an operand matches a RelSubset, VolcanoRuleCall needs additional handling 
for various cases so that its behavior is consistent.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

Re: Stored Proc to Relational Expression

2020-04-21 Thread Stamatis Zampetakis

Hi Ravi,

As Julian already mentioned, we are quite far from what you would like to
achieve.

Nevertheless, I would like to mention that you can model certain kinds of
loops and recursion using the RepeatUnion [1] and Spool [2] relational
expressions. Note that these APIs are experimental.
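
To make the loop-modelling idea concrete, here is a minimal plain-Java sketch
of the evaluation scheme a repeat union follows: evaluate a seed once, then
re-evaluate an iterative step until it produces no new rows. It does not use
the Calcite API. The names RepeatUnionSketch, repeatUnion and iterationLimit
are hypothetical, only the duplicate-discarding (UNION) flavour is modelled,
and feeding each round with the rows of the previous round is a simplifying
assumption.

```java
import java.util.LinkedHashSet;
import java.util.Set;
import java.util.function.Function;

public class RepeatUnionSketch {
  /**
   * Hypothetical helper, not part of Calcite: evaluates the seed once, then
   * applies the iterative step to the rows produced by the previous round
   * until no new rows appear or iterationLimit rounds have run
   * (a negative limit means "no limit").
   */
  public static <T> Set<T> repeatUnion(Set<T> seed,
      Function<Set<T>, Set<T>> step, int iterationLimit) {
    Set<T> result = new LinkedHashSet<>(seed);  // UNION semantics: no duplicates
    Set<T> delta = new LinkedHashSet<>(seed);   // rows produced by the last round
    int rounds = 0;
    while (!delta.isEmpty() && (iterationLimit < 0 || rounds < iterationLimit)) {
      Set<T> next = new LinkedHashSet<>(step.apply(delta));
      next.removeAll(result);                   // keep only rows not seen before
      result.addAll(next);
      delta = next;
      rounds++;
    }
    return result;
  }

  public static void main(String[] args) {
    // Counts upwards from 1: the seed is {1}; each round adds n + 1 while n < 5.
    Set<Integer> seed = new LinkedHashSet<>();
    seed.add(1);
    Set<Integer> numbers = repeatUnion(seed, rows -> {
      Set<Integer> out = new LinkedHashSet<>();
      for (int n : rows) {
        if (n < 5) {
          out.add(n + 1);
        }
      }
      return out;
    }, -1);
    System.out.println(numbers);
  }
}
```

The loop stops on its own once a round adds nothing new, which is the sense in
which a fixed-point iteration can stand in for a procedural loop. It does not
cover statements with side effects such as SET or BREAK.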

Moreover, if your end goal is to push the computation into a DBMS, I have to
warn you that the code to translate these expressions back to SQL is not
there yet.

Best,
Stamatis

[1]
https://github.com/apache/calcite/blob/master/core/src/main/java/org/apache/calcite/rel/core/RepeatUnion.java
[2]
https://github.com/apache/calcite/blob/master/core/src/main/java/org/apache/calcite/rel/core/Spool.java

On Mon, Apr 20, 2020 at 8:08 PM Julian Hyde  wrote:

> Calcite relational expressions can represent SELECT, INSERT etc. but not
> procedural code. It’s a direction we could consider going.
>
> RexProgram is the closest thing we currently have to procedural code in
> the algebra - single assignment of variables, use of variables in
> expressions assigning to other variables - but it is a long way short
> because there are no loops.
>
> > On Apr 19, 2020, at 12:47 PM, Ravi Kapoor 
> wrote:
> >
> > Hi Team,
> >
> > I have my use where I need to convert my dialect specific stored
> procedure
> > constructs like while loop, If then else to Rel expression
> >
> > Basically this can contain control flow statements like below
> >
> > DECLARE heads BOOL;
> > DECLARE heads_count INT64 DEFAULT 0;
> > LOOP
> >  SET heads = RAND() < 0.5;
> >  IF heads THEN
> >SELECT 'Heads!';
> >SET heads_count = heads_count + 1;
> >  ELSE
> >SELECT 'Tails!';
> >BREAK;
> >  END IF;
> > END LOOP;
> > SELECT CONCAT(CAST(heads_count AS STRING), ' heads in a row');
> >
> >
> > I can create a Java AST model from the linq4j provided by calcite however
> > this is only going to generate Java Result and I believe its only used by
> > the calcite for relational expressions of enumerable calling convention
> > which is used by adapters which does not support core relational
> operations
> > right?
> >
> > Is there a way I can convert the stored proc constructs into some
> canonical
> > form like Rel Tree and back to Stored proc of target dialect.
> > --
> >
> > Thanks,
> > Ravi
>
>

[jira] [Created] (CALCITE-3947) AbstractRelOptPlanner.classes should be LinkedHashSet so that rule match order is deterministic across runs

2020-04-21 Thread Botong Huang (Jira)

Botong Huang created CALCITE-3947:
-

 Summary: AbstractRelOptPlanner.classes should be LinkedHashSet so 
that rule match order is deterministic across runs
 Key: CALCITE-3947
 URL: https://issues.apache.org/jira/browse/CALCITE-3947
 Project: Calcite
  Issue Type: Improvement
Reporter: Botong Huang


AbstractRelOptPlanner.classes is used by subClasses() to determine what to 
put into VolcanoPlanner.classOperands, which is then used in 
VolcanoPlanner.fireRules(). Since AbstractRelOptPlanner.classes is currently a 
HashSet, its iteration order is not deterministic across runs, which makes 
debugging hard. It should be a LinkedHashSet, just like many other fields in the planner.
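
A small stand-alone sketch of the difference, using only java.util and no
Calcite classes (IterationOrderSketch and names are hypothetical names for
illustration). A HashSet iterates in an order derived from hash codes, and
java.lang.Class does not override hashCode(), so the order for Class keys can
change from one JVM run to the next. A LinkedHashSet always iterates in
insertion order.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

public class IterationOrderSketch {
  /** Returns the iteration order of the set as a list of simple class names. */
  public static List<String> names(Set<Class<?>> classes) {
    List<String> out = new ArrayList<>();
    for (Class<?> c : classes) {
      out.add(c.getSimpleName());
    }
    return out;
  }

  public static void main(String[] args) {
    List<Class<?>> registered = new ArrayList<>();
    registered.add(String.class);
    registered.add(Integer.class);
    registered.add(Thread.class);
    registered.add(StringBuilder.class);

    // HashSet: order follows the identity hash codes of the Class objects,
    // so it is not guaranteed to be the same on every run.
    Set<Class<?>> hashed = new HashSet<>(registered);
    // LinkedHashSet: iteration always follows insertion order.
    Set<Class<?>> linked = new LinkedHashSet<>(registered);

    System.out.println("HashSet order (may vary): " + names(hashed));
    System.out.println("LinkedHashSet order:      " + names(linked));
  }
}
```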




Re: [DISCUSS] Towards Cascades Optimizer

2020-04-21 Thread Андрей Цвелодуб

Hello Haisheng,

> To keep backward compatibility, all the un-marked rules will be treated
> as logical rules, except rules that uses AbstractConverter as rule operand,
> these rules still need to applied top-down, or random order.
Obviously, from what is written here, I could guess that this would require
me to change my physical planning rules, even if only by implementing a
marker interface. I am not saying this is a bad thing, but this is a thing
that should be communicated and planned ahead in case the VolcanoPlanner is
modified.

> Looks like I have to revert changes in CALCITE-2970 and CALCITE-3753,
> because they will cause another tons of plan changes.
I see you are still bitter due to all the discussions on this list lately,
I'm sorry. I don't want you to think that I somehow resent the changes you
are pushing, au contraire I support them and would be happy to help if I
can. I just want the process of these changes to be executed in the best
possible way.
As I see there are already several opinions in this thread that basically
align with what I am saying, so I guess I am not the crazy guy running
around and yelling "the end is nigh!".

Thank you for taking these mumbled thoughts into account.

Bestest Regards,
Andrii Tsvielodub

On Tue, 21 Apr 2020 at 21:08, Haisheng Yuan  wrote:

> Hi Andrii,
>
> > I guess changing the planner would lead to changes in tons of rules and
> even more tests.
> Obviously you didn't read through my email. You are not required to do any
> changes to your rule if you don't want to, but if you do, just need to mark
> the rule to tell planner whether it is a physical rule or not, simply by
> implementing an empty interface.
>
> > many on this list already experienced problems with upgrading even
> between the minor versions of Calcite.
> Sorry to see the problem you have experienced when upgrading Calcite.
> Looks like I have to revert changes in CALCITE-2970 and CALCITE-3753,
> because they will cause another tons of plan changes.
>
> But I will see if I can add a setting to use the old search strategy,
> which can be left untouched.
>
> Haisheng
>
> On 2020/04/21 06:33:08, Андрей Цвелодуб  wrote:
> > Hello everyone,
> >
> > First of all, thanks for this great effort of improving the core parts of
> > the framework we all are using,
> > I believe this is long overdue and hope this will have benefits both for
> > the maintainers and users of the library.
> >
> > I don't have anything to say about the general idea at the moment,
> > but I want to make a point that maintaining the old implementation of
> > VolcanoPlanner during
> > the initial stages of implementing the new planner is absolutely
> CRITICAL.
> > As a lot of users of Calcite do various customizations to the engine, to
> > the rules
> > and all that is there in between, I believe changing the implementation
> of
> > the core component
> > would have a huge impact on most users of the library. I think many on
> this
> > list
> > already experienced problems with upgrading even between the minor
> versions
> > of Calcite,
> > so I guess changing the planner would lead to changes in tons of rules
> and
> > even more tests.
> >
> > I don't have anything against replacing VolcanoPlanner as a final goal of
> > this effort,
> > but I don't think that modifying it directly and merging it to master is
> a
> > viable development approach.
> > While I understand how burdensome it is to maintain several parallel core
> > components at once
> > (we did this while moving the engine of our product to Calcite), we
> should
> > still respect those who depend
> > on it and not introduce the risks related to the development of a new
> > component into existing processing flows.
> >
> > A good model to try to follow would be the way new Garbage Collectors are
> > introduced in Java.
> > First, add it as an experimental option, then make it generally
> available,
> > then after everyone agrees
> > this is the best option - make it the default one.
> > With this approach, everyone can then move to the new planner at their
> own
> > pace, guaranteeing a smooth transition overall.
> > Yes, this could take some time, maybe even a year, but this is the price
> of
> > doing major changes in a popular framework.
> >
> > Again, thank you for initiating this discussion and leading this effort.
> >
> > Best Regards,
> > Andrii Tsvielodub
> >
> > On Tue, 21 Apr 2020 at 07:51, Jinpeng Wu  wrote:
> >
> > > Hi, Xiening.
> > >
> > > Regarding calculating the logical cost, here are some ways I though:
> > > 1. Logical rel may implement their own computeSelfCost method. Some
> > > rels can provide such information, for example the
> > > LogicalProject/LogicalFilter contains nearly the same information as
> their
> > > physical implementations. If we don't have enough confidence, just
> return
> > > zeroCost is also OK, as it only affects pruning.
> > > 2. Logical rel  tells its parents what its physical input could be
> after
> > > implementation. Then

Re: [DISCUSS] Towards Cascades Optimizer

2020-04-21 Thread Haisheng Yuan

Hi Andrii,

> I guess changing the planner would lead to changes in tons of rules and even 
> more tests.
Obviously you didn't read through my email. You are not required to make any 
changes to your rules if you don't want to; if you do, you just need to mark 
each rule to tell the planner whether it is a physical rule or not, simply by 
implementing an empty interface.
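
As a rough illustration of the marking idea, here is a self-contained Java
sketch. Every name in it (Rule, PhysicalRule, MyLogicalRule, MyPhysicalRule,
isPhysical) is hypothetical and stands in for whatever the planner would
actually define; none of it is Calcite API.

```java
public class MarkerInterfaceSketch {
  /** Stand-in for a planner rule; hypothetical, not Calcite's RelOptRule. */
  public interface Rule {
    String name();
  }

  /** Hypothetical empty marker: implementing it declares a rule as physical. */
  public interface PhysicalRule {
  }

  /** An existing rule left untouched; unmarked rules count as logical. */
  public static class MyLogicalRule implements Rule {
    @Override public String name() {
      return "MyLogicalRule";
    }
  }

  /** A rule opted in as physical simply by adding the marker interface. */
  public static class MyPhysicalRule implements Rule, PhysicalRule {
    @Override public String name() {
      return "MyPhysicalRule";
    }
  }

  /** How a planner could classify rules without touching unmarked ones. */
  public static boolean isPhysical(Rule rule) {
    return rule instanceof PhysicalRule;
  }

  public static void main(String[] args) {
    for (Rule rule : new Rule[] {new MyLogicalRule(), new MyPhysicalRule()}) {
      System.out.println(rule.name() + " physical? " + isPhysical(rule));
    }
  }
}
```

The point of the empty interface is that opting in is a one-word change to the
class declaration, and a rule that is never marked keeps its current treatment.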

> many on this list already experienced problems with upgrading even between 
> the minor versions of Calcite.
Sorry to see the problems you have experienced when upgrading Calcite. Looks 
like I have to revert the changes in CALCITE-2970 and CALCITE-3753, because they 
will cause tons more plan changes.

But I will see if I can add a setting to use the old search strategy, which can 
be left untouched.

Haisheng

On 2020/04/21 06:33:08, Андрей Цвелодуб  wrote: 
> Hello everyone,
> 
> First of all, thanks for this great effort of improving the core parts of
> the framework we all are using,
> I believe this is long overdue and hope this will have benefits both for
> the maintainers and users of the library.
> 
> I don't have anything to say about the general idea at the moment,
> but I want to make a point that maintaining the old implementation of
> VolcanoPlanner during
> the initial stages of implementing the new planner is absolutely CRITICAL.
> As a lot of users of Calcite do various customizations to the engine, to
> the rules
> and all that is there in between, I believe changing the implementation of
> the core component
> would have a huge impact on most users of the library. I think many on this
> list
> already experienced problems with upgrading even between the minor versions
> of Calcite,
> so I guess changing the planner would lead to changes in tons of rules and
> even more tests.
> 
> I don't have anything against replacing VolcanoPlanner as a final goal of
> this effort,
> but I don't think that modifying it directly and merging it to master is a
> viable development approach.
> While I understand how burdensome it is to maintain several parallel core
> components at once
> (we did this while moving the engine of our product to Calcite), we should
> still respect those who depend
> on it and not introduce the risks related to the development of a new
> component into existing processing flows.
> 
> A good model to try to follow would be the way new Garbage Collectors are
> introduced in Java.
> First, add it as an experimental option, then make it generally available,
> then after everyone agrees
> this is the best option - make it the default one.
> With this approach, everyone can then move to the new planner at their own
> pace, guaranteeing a smooth transition overall.
> Yes, this could take some time, maybe even a year, but this is the price of
> doing major changes in a popular framework.
> 
> Again, thank you for initiating this discussion and leading this effort.
> 
> Best Regards,
> Andrii Tsvielodub
> 
> On Tue, 21 Apr 2020 at 07:51, Jinpeng Wu  wrote:
> 
> > Hi, Xiening.
> >
> > Regarding calculating the logical cost, here are some ways I though:
> > 1. Logical rel may implement their own computeSelfCost method. Some
> > rels can provide such information, for example the
> > LogicalProject/LogicalFilter contains nearly the same information as their
> > physical implementations. If we don't have enough confidence, just return
> > zeroCost is also OK, as it only affects pruning.
> > 2. Logical rel  tells its parents what its physical input could be after
> > implementation. Then the problem come back to calculating lower bound of a
> > physical rel.
> > There should always be ways. The only problem is how to find a pretty one.
> >
> > Regarding the risk, new planner do have different risk. It is not because
> > new planner could stop us doing something wrong but we can decide when to
> > use the new one. Some scenarios:
> > 1. If modifying the VolcanoPlanner directly, the only way user could
> > control the risk is not to upgrade calcite version until it is considered
> > stable. You know, it is quite different from keeping calcite updated and
> > switching to the new planner at a proper time.
> > 2. It is very importance for SLA control. For the important business and
> > jobs, we may keep using the old and stable planner. And use the new one
> > only for jobs that have fault tolerance. And this helps testing new planner
> > with actual scenarios.
> > 3. It is helpful when upgrading online services. When the new planner
> > happened to have some bugs, we can switch to the old planner directly
> > without rollback the whole service.
> > 4. With all these ways to prevent issues becoming disasters, we are not
> > vulnerable to making mistakes. This not only enables faster iterations but
> > also let us have enough time to resolve big bugs, like considering it in
> > detail and applying a time-consuming refactoring for it. To work around a
> > critical bug using tricky ways usually introduces more issues.
> >
> > Thanks,

[jira] [Created] (CALCITE-3946) Add parser support for MULTISET/SET and VOLATILE modifiers in CREATE TABLE statements

2020-04-21 Thread dasch (Jira)

dasch created CALCITE-3946:
--

 Summary: Add parser support for MULTISET/SET and VOLATILE 
modifiers in CREATE TABLE statements
 Key: CALCITE-3946
 URL: https://issues.apache.org/jira/browse/CALCITE-3946
 Project: Calcite
  Issue Type: Improvement
  Components: babel
Reporter: dasch


Add support to Calcite's Babel parser for MULTISET/SET and VOLATILE modifiers 
in CREATE TABLE statements.

The syntax for these statements is:
CREATE TABLE [SET|MULTISET] [VOLATILE]  [IF NOT EXISTS] 
( , ...);




Re: Understanding annotations of SqlGroupingFunction

2020-04-21 Thread Julian Hyde

Suppose we have one row that represents the total for department 10, and 
another that represents the grand total of all departments. In which row would 
we say that department is “grouped” (in Calcite’s parlance) or “aggregated” 
(in Hive’s parlance)?

I find the terms confusing. It’s possible that Calcite has them “wrong”. 

I would simply run a query like

  SELECT deptno, job, GROUPING(deptno, job), COUNT(*)
  FROM emp
  GROUP BY CUBE (deptno, job)

and see whether Hive and Calcite return the same result. 
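
For comparison while reading the query output, here is a small Java sketch of
the convention quoted from the Hive documentation later in this message: a bit
is 0 when the column is part of the grouping set of the row and 1 when it is
not. Treating the leftmost argument as the most significant bit is an
assumption of this sketch, and GroupingSketch and grouping are hypothetical
names. The sketch makes no claim about what Calcite itself returns.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class GroupingSketch {
  /**
   * Computes GROUPING(args...) for a row produced by the given grouping set.
   * Convention (as in the quoted Hive text): bit = 0 if the column is part of
   * the grouping set, bit = 1 if it is not. Assumption: the leftmost argument
   * is the most significant bit.
   */
  public static long grouping(List<String> args, Set<String> groupingSet) {
    long result = 0;
    for (String column : args) {
      result <<= 1;
      if (!groupingSet.contains(column)) {
        result |= 1;
      }
    }
    return result;
  }

  public static void main(String[] args) {
    List<String> columns = Arrays.asList("deptno", "job");
    // The four grouping sets generated by CUBE (deptno, job).
    String[][] cube = {{"deptno", "job"}, {"deptno"}, {"job"}, {}};
    for (String[] members : cube) {
      Set<String> groupingSet = new HashSet<String>(Arrays.asList(members));
      System.out.println(Arrays.toString(members) + " -> "
          + grouping(columns, groupingSet));
    }
  }
}
```

Under this convention the grand-total row (empty grouping set) gets the highest
value and the fully grouped row gets 0, which is the opposite of the numbering
in the Javadoc quoted below.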

> On Apr 20, 2020, at 6:58 PM, ZZY  wrote:
> 
> Hi, Hyde:
> It's confused me that some annotations in
> Calcite(org.apache.calcite.sql.fun.SqlGroupingFunction.java) :
> /**
> * The {@code GROUPING} function.
> *
> * Accepts 1 or more arguments.
> * Example: {@code GROUPING(deptno, gender)} returns
> * 3 if both deptno and gender are being grouped,
> * 2 if only deptno is being grouped,
> * 1 if only gender is being grouped,
> * 0 if neither deptno nor gender are being grouped.
> *
> * This function is defined in the SQL standard.
> * {@code GROUPING_ID} is a non-standard synonym.
> *
> * Some examples are in {@code agg.iq}.
> */
> 
> The annotations above seems conflicts with other implementations like Hive(
> https://cwiki.apache.org/confluence/display/Hive/Enhanced+Aggregation%2C+Cube%2C+Grouping+and+Rollup?spm=ata.13261165.0.0.528c6dfcXalQFy#EnhancedAggregation,Cube,GroupingandRollup-Groupingfunction
> )
> 
> Notice that: "The grouping function indicates whether an expression in a
> GROUP BY clause is aggregated or not for a given row. The value 0
> represents a column that is part of the grouping set, while the value 1
> represents a column that is not part of the grouping set. "
> 
> 
> It is clearly that 0 and 1 bit have different interpretation  between
> annotations in Calcite and in Hive. And I did not figure out why...
> 
> Any feedback can give me on this would be highly appreciated.
> 
> Best regards!

[jira] [Created] (CALCITE-3945) Update Streaming documentation

2020-04-21 Thread Rui Wang (Jira)

Rui Wang created CALCITE-3945:
-

 Summary: Update Streaming documentation
 Key: CALCITE-3945
 URL: https://issues.apache.org/jira/browse/CALCITE-3945
 Project: Calcite
  Issue Type: Sub-task
Reporter: Rui Wang


 




[jira] [Created] (CALCITE-3944) Move dumpSets and dumpGraphviz out of VolcanoPlanner

2020-04-21 Thread Haisheng Yuan (Jira)

Haisheng Yuan created CALCITE-3944:
--

 Summary: Move dumpSets and dumpGraphviz out of VolcanoPlanner
 Key: CALCITE-3944
 URL: https://issues.apache.org/jira/browse/CALCITE-3944
 Project: Calcite
  Issue Type: Improvement
  Components: core
Reporter: Haisheng Yuan


They are helper methods and will be moved to Dumper.java.




[jira] [Created] (CALCITE-3943) Remove the JSON functions keyword from parser

2020-04-21 Thread Danny Chen (Jira)

Danny Chen created CALCITE-3943:
---

 Summary: Remove the JSON functions keyword from parser
 Key: CALCITE-3943
 URL: https://issues.apache.org/jira/browse/CALCITE-3943
 Project: Calcite
  Issue Type: Improvement
  Components: core
Affects Versions: 1.22.0
Reporter: Danny Chen
Assignee: Danny Chen
 Fix For: 1.23.0


The JSON functions were introduced in CALCITE-2266. We coded the JSON keywords 
into the parser and generate the operators directly, so downstream projects have 
no chance to change or override the builtin operators.

This issue proposes removing these keywords from the parser; the SQL parser 
should always generate SqlUnresolvedFunction instead of specific operators.




Re: [DISCUSS] Towards Cascades Optimizer

2020-04-21 Thread Chunwei Lei

Haisheng and Xiening, thanks for sharing these wonderful ideas. I believe
this will be a huge improvement
and definitely benefits all users.

From my experience of upgrading calcite version, there are always some
changes in the new version which
may lead to unexpected behavior due to a lack of enough integration tests
with other systems even though
it is a minor version. Currently, the VolcanoPlanner is proved to be
production-ready and it has to be stable
to make sure that other systems built on it would not have much trouble
after using the new calcite version.

So from my side, I hope we can at least provide an option to use the new
planner, considering how large the changes are.


Best,
Chunwei


On Tue, Apr 21, 2020 at 2:42 PM Андрей Цвелодуб 
wrote:

> Hello everyone,
>
> First of all, thanks for this great effort of improving the core parts of
> the framework we all are using,
> I believe this is long overdue and hope this will have benefits both for
> the maintainers and users of the library.
>
> I don't have anything to say about the general idea at the moment,
> but I want to make a point that maintaining the old implementation of
> VolcanoPlanner during
> the initial stages of implementing the new planner is absolutely CRITICAL.
> As a lot of users of Calcite do various customizations to the engine, to
> the rules
> and all that is there in between, I believe changing the implementation of
> the core component
> would have a huge impact on most users of the library. I think many on this
> list
> already experienced problems with upgrading even between the minor versions
> of Calcite,
> so I guess changing the planner would lead to changes in tons of rules and
> even more tests.
>
> I don't have anything against replacing VolcanoPlanner as a final goal of
> this effort,
> but I don't think that modifying it directly and merging it to master is a
> viable development approach.
> While I understand how burdensome it is to maintain several parallel core
> components at once
> (we did this while moving the engine of our product to Calcite), we should
> still respect those who depend
> on it and not introduce the risks related to the development of a new
> component into existing processing flows.
>
> A good model to try to follow would be the way new Garbage Collectors are
> introduced in Java.
> First, add it as an experimental option, then make it generally available,
> then after everyone agrees
> this is the best option - make it the default one.
> With this approach, everyone can then move to the new planner at their own
> pace, guaranteeing a smooth transition overall.
> Yes, this could take some time, maybe even a year, but this is the price of
> doing major changes in a popular framework.
>
> Again, thank you for initiating this discussion and leading this effort.
>
> Best Regards,
> Andrii Tsvielodub
>
> On Tue, 21 Apr 2020 at 07:51, Jinpeng Wu  wrote:
>
> > Hi, Xiening.
> >
> > Regarding calculating the logical cost, here are some ways I though:
> > 1. Logical rel may implement their own computeSelfCost method. Some
> > rels can provide such information, for example the
> > LogicalProject/LogicalFilter contains nearly the same information as
> their
> > physical implementations. If we don't have enough confidence, just return
> > zeroCost is also OK, as it only affects pruning.
> > 2. Logical rel  tells its parents what its physical input could be after
> > implementation. Then the problem come back to calculating lower bound of
> a
> > physical rel.
> > There should always be ways. The only problem is how to find a pretty
> one.
> >
> > Regarding the risk, new planner do have different risk. It is not because
> > new planner could stop us doing something wrong but we can decide when to
> > use the new one. Some scenarios:
> > 1. If modifying the VolcanoPlanner directly, the only way user could
> > control the risk is not to upgrade calcite version until it is considered
> > stable. You know, it is quite different from keeping calcite updated and
> > switching to the new planner at a proper time.
> > 2. It is very importance for SLA control. For the important business and
> > jobs, we may keep using the old and stable planner. And use the new one
> > only for jobs that have fault tolerance. And this helps testing new
> planner
> > with actual scenarios.
> > 3. It is helpful when upgrading online services. When the new planner
> > happened to have some bugs, we can switch to the old planner directly
> > without rollback the whole service.
> > 4. With all these ways to prevent issues becoming disasters, we are not
> > vulnerable to making mistakes. This not only enables faster iterations
> but
> > also let us have enough time to resolve big bugs, like considering it in
> > detail and applying a time-consuming refactoring for it. To work around a
> > critical bug using tricky ways usually introduces more issues.
> >
> > Thanks,
> > Jinpeng
> >
> > On Tue, Apr 21, 2020 at 2:04 AM Xiening Dai  wrote:
> >

[jira] [Created] (CALCITE-3942) Move type-coercion configurations into SqlValidator.Config

2020-04-21 Thread Danny Chen (Jira)

Danny Chen created CALCITE-3942:
---

 Summary: Move type-coercion configurations into SqlValidator.Config
 Key: CALCITE-3942
 URL: https://issues.apache.org/jira/browse/CALCITE-3942
 Project: Calcite
  Issue Type: Improvement
  Components: core
Affects Versions: 1.22.0
Reporter: Danny Chen
Assignee: Danny Chen
 Fix For: 1.23.0


SqlValidator.Config is the new place to hold all configuration options of 
SqlValidator.



