Both tickets have been fixed on current master!
The former was a regression.
The latter was an improvement in Calcite that only required a fix to a test in the HerdDB suite; check the JIRA for more details.

We are re-running all of the tests locally, both for HerdDB and for some known downstream applications.

Thank you!
Enrico

On Wed, 13 May 2020 at 15:05, Enrico Olivelli <eolive...@gmail.com> wrote:

> Tickets:
> https://issues.apache.org/jira/browse/CALCITE-3997
> https://issues.apache.org/jira/browse/CALCITE-3998
>
> I will try to create the reproducer, but maybe you will be smarter than me
> :-)
>
>
> Enrico
>
> On Wed, 13 May 2020 at 14:44, Haisheng Yuan <hy...@apache.org> wrote:
>
>> > Yesterday I was trying to create a test case in Calcite codebase.
>> > But I did not find where to put it.
>> > Can you please give me a hint?
>> Maybe JdbcTest.java, take testMergeJoin() as an example.
>>
>> > Otherwise I will try to create a minimal Java block of code that reproduces
>> > the problem. I did it that way last time and Stamatis was able to create the
>> > test in Calcite code
>> >
>> > Does this approach work for you?
>> That would also work.
>>
>> Thanks,
>> Haisheng
>> On 2020/05/13 12:31:26, Enrico Olivelli <eolive...@gmail.com> wrote:
>> > On Wed, 13 May 2020, 13:45 Haisheng Yuan <hy...@apache.org> wrote:
>> >
>> > > Hi Enrico,
>> > >
>> > > > Is it possible to disable it? I will check. Any suggestion is welcome
>> > > Disabling it won't help. It is a Calcite bug. There is nothing wrong in
>> > > HerdDB. Can you help us log a JIRA and provide a reproducible test case?
>> > >
>> >
>> > Sorry for the delay.
>> > I had another problem today. I will do it as soon as possible.
>> >
>> > Yesterday I was trying to create a test case in Calcite codebase.
>> > But I did not find where to put it.
>> > Can you please give me a hint?
>> > Otherwise I will try to create a minimal Java block of code that reproduces
>> > the problem. I did it that way last time and Stamatis was able to create the
>> > test in Calcite code
>> >
>> > Does this approach work for you?
>> >
>> > Enrico
>> >
>> >
>> > > > Do you think that I can safely disable those rules?
>> > > You have to create your own rule instances. But let Calcite do it for you.
>> > >
>> > > Thanks,
>> > > Haisheng Yuan
>> > >
>> > > On 2020/05/13 08:15:30, Enrico Olivelli <eolive...@gmail.com> wrote:
>> > > > Haisheng,
>> > > >
>> > > > On Tue, 12 May 2020, 16:38 Haisheng Yuan <hy...@apache.org> wrote:
>> > > >
>> > > > > Hi Enrico,
>> > > > >
>> > > > > Thanks for reporting issues so quickly for calcite-1.23.0-rc0. Appreciate it.
>> > > > > Can you log JIRA for these issues? We will fix them.
>> > > > >
>> > > > Doing it now
>> > > >
>> > > >
>> > > > > Regarding the first issue, I guess several factors are contributing
>> > > > > to the issue.
>> > > > > 1. Trait enforcement is enabled for EnumerableConvention by default in
>> > > > > 1.23.0, now it can generate mergejoins. We can disable it again if people
>> > > > > would like.
>> > > > >
>> > > >
>> > > > Is it possible to disable it? I will check. Any suggestion is welcome
>> > > >
>> > > >
>> > > > > 2. RelBuilder hasn't been able to handle physical operator's trait well
>> > > > > yet, especially for Project.
>> > > > >
>> > > > > 3. Logical operator has been doing some work that it is not expected to
>> > > > > do, but physical operator should do. Here when creating LogicalProject, it
>> > > > > is trying to deduce its collation from input MergeJoin. Project is a
>> > > > > frequently created operator, but profiler shows that
>> > > > > RelTraitSet.replaceIfs() takes 65% of the total runtime of
>> > > > > LogicalProject.create(). That is not only an inappropriate operation, but also
>> > > > > a time-wasting operation.
>> > > > >
>> > > > > 4. Transformation rules can match with physical operator.
>> > > > > In this case,
>> > > > > JoinPushExpressionsRule matched with EnumerableMergeJoin, but the rule
>> > > > > can't deal with physical operators well, because the traits are not properly
>> > > > > handled. This does not only happen with JoinPushExpressionsRule: if you tweak the
>> > > > > query, you might be able to see a similar assertion error when applying rule
>> > > > > FilterIntoJoinRule. The problem has been there since their inception, but
>> > > > > it was only disclosed today by HerdDB. Does that mean no one uses Calcite's
>> > > > > default rule implementation to match trait-aware physical operators,
>> > > > > intentionally? Can we safely stop matching physical operators in these
>> > > > > rules? (ProjectMerge can be an exception, some people use it on physical
>> > > > > operator for post processing).
>> > > > >
>> > > >
>> > > > Do you think that I can safely disable those rules?
>> > > >
>> > > > Enrico
>> > > >
>> > > >
>> > > > > Thanks,
>> > > > > Haisheng
>> > > > >
>> > > > >
>> > > > > On 2020/05/12 09:10:31, Enrico Olivelli <eolive...@gmail.com> wrote:
>> > > > > > Haisheng,
>> > > > > > I am sorry, I have a couple of problems with HerdDB.
>> > > > > >
>> > > > > > 1) JOIN order unsorted columns in presence of a WHERE over other columns
>> > > > > > This is my case:
>> > > > > >
>> > > > > > CREATE TABLE tblspace1.table1 (k1 string primary key,n1 int,s1 string)
>> > > > > > CREATE TABLE tblspace1.table3 (k1 string primary key,n3 int,s3 string)
>> > > > > > SELECT t1.k1 as first, t2.k1 as second
>> > > > > > FROM tblspace1.table1 t1
>> > > > > > INNER JOIN tblspace1.table3 t2 ON t1.k1=t2.k1
>> > > > > > WHERE t1.n1 + 1 = t2.n3
>> > > > > >
>> > > > > > In this case for table1 and table3 no column is physically sorted (no
>> > > > > > column with a collation)
>> > > > > >
>> > > > > > I have this Planner error:
>> > > > > > java.lang.AssertionError: cannot merge join: left input is not sorted on left keys
>> > > > > > at org.apache.calcite.rel.metadata.RelMdCollation.mergeJoin(RelMdCollation.java:457)
>> > > > > > at org.apache.calcite.rel.metadata.RelMdCollation.collations(RelMdCollation.java:153)
>> > > > > > at GeneratedMetadataHandler_Collation.collations_$(Unknown Source)
>> > > > > > at GeneratedMetadataHandler_Collation.collations(Unknown Source)
>> > > > > > at org.apache.calcite.rel.metadata.RelMetadataQuery.collations(RelMetadataQuery.java:539)
>> > > > > > at org.apache.calcite.rel.metadata.RelMdCollation.project(RelMdCollation.java:273)
>> > > > > > at org.apache.calcite.rel.logical.LogicalProject.lambda$create$0(LogicalProject.java:122)
>> > > > > > at org.apache.calcite.plan.RelTraitSet.replaceIfs(RelTraitSet.java:242)
>> > > > > > at org.apache.calcite.rel.logical.LogicalProject.create(LogicalProject.java:121)
>> > > > > > at org.apache.calcite.rel.logical.LogicalProject.create(LogicalProject.java:111)
>> > > > > > at org.apache.calcite.rel.core.RelFactories$ProjectFactoryImpl.createProject(RelFactories.java:172)
>> > > > > > at org.apache.calcite.tools.RelBuilder.project_(RelBuilder.java:1464)
>> > > > > > at org.apache.calcite.tools.RelBuilder.project(RelBuilder.java:1258)
>> > > > > > at org.apache.calcite.tools.RelBuilder.project(RelBuilder.java:1230)
>> > > > > > at org.apache.calcite.tools.RelBuilder.project(RelBuilder.java:1219)
>> > > > > > at org.apache.calcite.plan.RelOptUtil.pushDownJoinConditions(RelOptUtil.java:3620)
>> > > > > > at org.apache.calcite.rel.rules.JoinPushExpressionsRule.onMatch(JoinPushExpressionsRule.java:59)
>> > > > > > at org.apache.calcite.plan.volcano.VolcanoRuleCall.onMatch(VolcanoRuleCall.java:221)
>> > > > > > at org.apache.calcite.plan.volcano.VolcanoPlanner.findBestExp(VolcanoPlanner.java:519)
>> > > > > > at herddb.sql.CalcitePlanner.runPlanner(CalcitePlanner.java:535)
>> > > > > > at herddb.sql.CalcitePlanner.translate(CalcitePlanner.java:292)
>> > > > > >
>> > > > > > *If I remove the "WHERE" clause then no error is reported.*
>> > > > > > We have many other test cases about JOINs and this one is the only one
>> > > > > > that fails.
>> > > > > >
>> > > > > > This is the failing test case on HerdDB
>> > > > > > https://github.com/diennea/herddb/blob/vote-calcite-123/herddb-core/src/test/java/herddb/core/SimpleJoinTest.java#L522
>> > > > > >
>> > > > > > We are using the default set of rules Programs.ofRules(Programs.RULE_SET)
>> > > > > >
>> > > > > > I will try to create a reproducer in Calcite core test suite, in order to
>> > > > > > understand if it is a bug in HerdDB or in Calcite
>> > > > > > but I am reporting the problem as early as possible.
>> > > > > > We wanted to create a daily job that tests HerdDB against current Calcite
>> > > > > > master but unfortunately we still have not found the time to do it.
>> > > > > >
>> > > > > > 2) Changed the data type of sum(N) from BIGINT to INTEGER
>> > > > > >
>> > > > > > I also noted that the type of sum(N), where N is an INTEGER column,
>> > > > > > is now sometimes reported by Calcite as INTEGER and sometimes as
>> > > > > > a BIGINT. In 1.22 it is always reported as BIGINT.
>> > > > > > So we have another test failing.
>> > > > > >
>> > > > > > SELECT sum(n1), count(*) as cc, k1
>> > > > > > FROM tblspace1.tsql
>> > > > > > GROUP by k1
>> > > > > > ORDER BY sum(n1)
>> > > > > >
>> > > > > > Here sum(n1) is now reported as an INTEGER, previously it was a BIGINT. I would
>> > > > > > prefer to see it as a BIGINT in order to prevent overflows
>> > > > > >
>> > > > > > Here are the plans:
>> > > > > > INFO: Query: SELECT sum(n1), count(*) as cc, k1 FROM tblspace1.tsql GROUP by k1 ORDER BY sum(n1) -- Logical Plan
>> > > > > > LogicalSort(sort0=[$0], dir0=[ASC]): rowcount = 2.0, cumulative cost = {10.525000095367432 rows, 37.0 cpu, 0.0 io}, id = 1038
>> > > > > >   LogicalProject(EXPR$0=[$1], CC=[$2], K1=[$0]): rowcount = 2.0, cumulative cost = {8.525000095367432 rows, 13.0 cpu, 0.0 io}, id = 1037
>> > > > > >     LogicalAggregate(group=[{0}], EXPR$0=[SUM($1)], CC=[COUNT()]): rowcount = 2.0, cumulative cost = {6.525000095367432 rows, 7.0 cpu, 0.0 io}, id = 1035
>> > > > > >       LogicalProject(K1=[$0], n1=[$1]): rowcount = 2.0, cumulative cost = {4.0 rows, 7.0 cpu, 0.0 io}, id = 1034
>> > > > > >         LogicalTableScan(table=[[tblspace1, tsql]]): rowcount = 2.0, cumulative cost = {2.0 rows, 3.0 cpu, 0.0 io}, id = 1032
>> > > > > >
>> > > > > > May 12, 2020 11:07:37 AM herddb.sql.CalcitePlanner runPlanner
>> > > > > > INFO: Query: SELECT sum(n1), count(*) as cc, k1 FROM tblspace1.tsql GROUP by k1 ORDER BY sum(n1) -- Best Plan
>> > > > > > EnumerableSort(sort0=[$0], dir0=[ASC]): rowcount = 2.0, cumulative cost = {5.0 rows, 31.0 cpu, 0.0 io}, id = 1245
>> > > > > >   EnumerableProject(EXPR$0=[$1], CC=[1:BIGINT], K1=[$0]): rowcount = 2.0, cumulative cost = {3.0 rows, 7.0 cpu, 0.0 io}, id = 1244
>> > > > > >     EnumerableInterpreter: rowcount = 2.0, cumulative cost = {1.0 rows, 1.0 cpu, 0.0 io}, id = 1243
>> > > > > >       BindableTableScan(table=[[tblspace1, tsql]], projects=[[0, 1]]): rowcount = 2.0, cumulative cost = {0.016 rows, 0.024 cpu, 0.0 io}, id = 1055
>> > > > > >
>> > > > > >
>> > > > > > Within the same test case, with the same tables, the result of this query
>> > > > > > has not changed:
>> > > > > > SELECT sum(n1) as ss, min(n1) as mi, max(n1) as ma FROM tblspace1.tsql
>> > > > > > INFO: Query: SELECT sum(n1) as ss, min(n1) as mi, max(n1) as ma FROM tblspace1.tsql -- Logical Plan
>> > > > > > LogicalAggregate(group=[{}], SS=[SUM($0)], MI=[MIN($0)], MA=[MAX($0)]): rowcount = 1.0, cumulative cost = {5.387500047683716 rows, 5.0 cpu, 0.0 io}, id = 1253
>> > > > > >   LogicalProject(n1=[$1]): rowcount = 2.0, cumulative cost = {4.0 rows, 5.0 cpu, 0.0 io}, id = 1252
>> > > > > >     LogicalTableScan(table=[[tblspace1, tsql]]): rowcount = 2.0, cumulative cost = {2.0 rows, 3.0 cpu, 0.0 io}, id = 1250
>> > > > > >
>> > > > > > May 12, 2020 11:08:48 AM herddb.sql.CalcitePlanner runPlanner
>> > > > > > INFO: Query: SELECT sum(n1) as ss, min(n1) as mi, max(n1) as ma FROM tblspace1.tsql -- Best Plan
>> > > > > > EnumerableAggregate(group=[{}], SS=[SUM($0)], MI=[MIN($0)], MA=[MAX($0)]): rowcount = 1.0, cumulative cost = {2.387500047683716 rows, 1.0 cpu, 0.0 io}, id = 1295
>> > > > > >   EnumerableInterpreter: rowcount = 2.0, cumulative cost = {1.0 rows, 1.0 cpu, 0.0 io}, id = 1294
>> > > > > >     BindableTableScan(table=[[tblspace1, tsql]], projects=[[1]]): rowcount = 2.0, cumulative cost = {0.012 rows, 0.018000000000000002 cpu, 0.0 io}, id = 1265
>> > > > > >
>> > > > > > This is the test on HerdDB
>> > > > > > https://github.com/diennea/herddb/blob/vote-calcite-123/herddb-core/src/test/java/herddb/sql/SimplerPlannerTest.java#L237
>> > > > > >
>> > > > > > I hope that helps
>> > > > > > Enrico
>> > > > > >
>> > > > > >
>> > > > > > On Tue, 12 May 2020 at 07:59, Haisheng Yuan <hy...@apache.org> wrote:
>> > > > > >
>> > > > > > > Hi all,
>> > > > > > >
>> > > > > > > I have created a build for Apache Calcite 1.23.0, release
>> > > > > > > candidate 0.
>> > > > > > >
>> > > > > > > Thanks to everyone who has contributed to this release.
>> > > > > > >
>> > > > > > > You can read the release notes here:
>> > > > > > > https://github.com/apache/calcite/blob/calcite-1.23.0-rc0/site/_docs/history.md
>> > > > > > >
>> > > > > > > The commit to be voted upon:
>> > > > > > > https://gitbox.apache.org/repos/asf?p=calcite.git;a=commit;h=edc37c0a21344a48b15877788e082c8acdc7b030
>> > > > > > >
>> > > > > > > Its hash is edc37c0a21344a48b15877788e082c8acdc7b030
>> > > > > > >
>> > > > > > > Tag:
>> > > > > > > https://github.com/apache/calcite/tree/calcite-1.23.0-rc0
>> > > > > > >
>> > > > > > > The artifacts to be voted on are located here:
>> > > > > > > https://dist.apache.org/repos/dist/dev/calcite/apache-calcite-1.23.0-rc0
>> > > > > > > (revision 39385)
>> > > > > > >
>> > > > > > > The hashes of the artifacts are as follows:
>> > > > > > > 7482b0bb76e672a15bbe846f2dbdc125bd0f3d8a32abf0ea9159b5db0ab2a2d1182e19b408098ecd68d7cc9ff5d7812ea0b33e4aeac818d191b695d437fa1a94
>> > > > > > > *apache-calcite-1.23.0-src.tar.gz
>> > > > > > >
>> > > > > > > A staged Maven repository is available for review at:
>> > > > > > > https://repository.apache.org/content/repositories/orgapachecalcite-1088/org/apache/calcite/
>> > > > > > >
>> > > > > > > Release artifacts are signed with the following key:
>> > > > > > > https://people.apache.org/keys/committer/hyuan.asc
>> > > > > > > https://dist.apache.org/repos/dist/release/calcite/KEYS
>> > > > > > >
>> > > > > > > N.B.
>> > > > > > > To create the jars and test Apache Calcite: "./gradlew build".
>> > > > > > >
>> > > > > > > If you do not have a Java environment available, you can run the tests
>> > > > > > > using docker. To do so, install docker and docker-compose, then run
>> > > > > > > "docker-compose run test" from the root of the directory.
>> > > > > > >
>> > > > > > > Please vote on releasing this package as Apache Calcite 1.23.0.
>> > > > > > >
>> > > > > > > The vote is open for the next 72 hours and passes if a majority of at
>> > > > > > > least three +1 PMC votes are cast.
>> > > > > > >
>> > > > > > > [ ] +1 Release this package as Apache Calcite 1.23.0
>> > > > > > > [ ]  0 I don't feel strongly about it, but I'm okay with the release
>> > > > > > > [ ] -1 Do not release this package because...
>> > > > > > >
>> > > > > > >
>> > > > > > > Here is my vote:
>> > > > > > >
>> > > > > > > +1 (binding)
>> > > > > > >
>> > > > > > > Thanks,
>> > > > > > > Haisheng Yuan
>> > > > > > >