[jira] [Created] (CALCITE-4819) SemiJoin operator is not skipped in materialized view-based rewriting algorithm
Jesus Camacho Rodriguez created CALCITE-4819:

Summary: SemiJoin operator is not skipped in materialized view-based rewriting algorithm
Key: CALCITE-4819
URL: https://issues.apache.org/jira/browse/CALCITE-4819
Project: Calcite
Issue Type: Bug
Reporter: Jesus Camacho Rodriguez
Assignee: Jesus Camacho Rodriguez

This had been solved in CALCITE-2277 but apparently it was broken in CALCITE-2696 (MaterializedViewRelOptRulesTest.testJoinMaterialization11 result was modified assuming the change was correct). The rewriting algorithm does not support semijoin so this can lead to incorrect results.

--
This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (CALCITE-4716) ClassCastException converting SARG in RelNode to SQL
Jesus Camacho Rodriguez created CALCITE-4716:

Summary: ClassCastException converting SARG in RelNode to SQL
Key: CALCITE-4716
URL: https://issues.apache.org/jira/browse/CALCITE-4716
Project: Calcite
Issue Type: Bug
Components: core
Reporter: Jesus Camacho Rodriguez
Assignee: Jesus Camacho Rodriguez

The stacktrace is the following:
{noformat}
class org.apache.calcite.rex.RexLocalRef cannot be cast to class org.apache.calcite.rex.RexLiteral (org.apache.calcite.rex.RexLocalRef and org.apache.calcite.rex.RexLiteral are in unnamed module of loader 'app')
java.lang.ClassCastException: class org.apache.calcite.rex.RexLocalRef cannot be cast to class org.apache.calcite.rex.RexLiteral (org.apache.calcite.rex.RexLocalRef and org.apache.calcite.rex.RexLiteral are in unnamed module of loader 'app')
	at org.apache.calcite.rel.rel2sql.SqlImplementor$Context.toSql(SqlImplementor.java:695)
	at org.apache.calcite.rel.rel2sql.SqlImplementor$Context.toSql(SqlImplementor.java:597)
	...
{noformat}

The relevant expressions in the Calc operator are the following:
{code}
...expr#5=[Sarg[(10..11]]], expr#6=[SEARCH($t0, $t5)]...
{code}

The current code in {{SqlImplementor}} assumes that the second argument to SEARCH is always a RexLiteral:
{code}
...
case SEARCH:
  final RexCall search = (RexCall) rex;
  literal = (RexLiteral) search.operands.get(1);
  final Sarg sarg = castNonNull(literal.getValueAs(Sarg.class));
  //noinspection unchecked
  return toSql(program, search.operands.get(0), literal.getType(), sarg);
...
{code}
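To make the failure mode concrete, here is a minimal, self-contained sketch of resolving a local reference to the expression it points at before casting to a literal. The types {{Node}}, {{Literal}} and {{LocalRef}} and the method {{literalOperand}} are hypothetical stand-ins invented for illustration; they are not Calcite classes, and this is not necessarily how the issue was fixed. It also assumes references only point at earlier entries of the expression list, so the loop terminates.

```java
import java.util.Arrays;
import java.util.List;

final class LocalRefSketch {
    /** Hypothetical stand-in for a row expression. */
    interface Node {}

    /** Hypothetical literal carrying some value. */
    static final class Literal implements Node {
        final Object value;
        Literal(Object value) { this.value = value; }
    }

    /** Hypothetical reference to another entry of the expression list. */
    static final class LocalRef implements Node {
        final int index;
        LocalRef(int index) { this.index = index; }
    }

    /**
     * Follows local references until a non-reference node is reached,
     * then checks the type instead of casting blindly.
     * Assumes references point at earlier entries (no cycles).
     */
    static Literal literalOperand(List<Node> exprs, Node operand) {
        Node n = operand;
        while (n instanceof LocalRef) {
            n = exprs.get(((LocalRef) n).index);
        }
        if (!(n instanceof Literal)) {
            throw new IllegalArgumentException("operand does not resolve to a literal");
        }
        return (Literal) n;
    }

    public static void main(String[] args) {
        // Entry 0 is the literal; entry 1 merely refers to it.
        List<Node> exprs = Arrays.<Node>asList(new Literal("Sarg[(10..11]]"), new LocalRef(0));
        System.out.println(literalOperand(exprs, new LocalRef(1)).value);
    }
}
```

A direct cast of {{new LocalRef(1)}} to {{Literal}} would fail in the same way as the stack trace above; resolving first avoids that.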
[jira] [Created] (CALCITE-4499) FilterJoinRule misses opportunity to push filter to semijoin input
Jesus Camacho Rodriguez created CALCITE-4499:

Summary: FilterJoinRule misses opportunity to push filter to semijoin input
Key: CALCITE-4499
URL: https://issues.apache.org/jira/browse/CALCITE-4499
Project: Calcite
Issue Type: Improvement
Components: core
Reporter: Jesus Camacho Rodriguez
Assignee: Jesus Camacho Rodriguez

Consider the following plan:
{code}
LogicalProject(DNAME=[$1])
  LogicalJoin(condition=[AND(=($0, $10), =($8, 100))], joinType=[semi])
    LogicalTableScan(table=[[scott, DEPT]])
    LogicalTableScan(table=[[scott, EMP]])
{code}

The second conjunct only refers to columns from the right input, thus it could be pushed to the right input. However, {{FilterJoinRule}} misses this opportunity.
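The classification behind "only refers to columns from the right input" can be sketched as below. This is an illustration only: {{ConjunctSideSketch}} and {{sideOf}} are hypothetical names, not {{FilterJoinRule}} code, and the left field count of 3 assumes the usual three-column scott.DEPT table (so fields $0..$2 come from the left input and $3..$10 from the right).

```java
import java.util.Arrays;
import java.util.List;

final class ConjunctSideSketch {
    enum Side { LEFT, RIGHT, BOTH, NONE }

    /**
     * Decides which join input a conjunct depends on, given the field
     * indexes it references and the number of fields of the left input.
     */
    static Side sideOf(List<Integer> fieldRefs, int leftFieldCount) {
        boolean usesLeft = false;
        boolean usesRight = false;
        for (int ref : fieldRefs) {
            if (ref < leftFieldCount) {
                usesLeft = true;
            } else {
                usesRight = true;
            }
        }
        if (usesLeft && usesRight) {
            return Side.BOTH;
        }
        if (usesLeft) {
            return Side.LEFT;
        }
        return usesRight ? Side.RIGHT : Side.NONE;
    }

    public static void main(String[] args) {
        int leftFieldCount = 3; // assumption: DEPT has 3 columns
        // =($0, $10) references both inputs, so it must stay in the join.
        System.out.println(sideOf(Arrays.asList(0, 10), leftFieldCount));
        // =($8, 100) references only the right input, so it is a push-down candidate.
        System.out.println(sideOf(Arrays.asList(8), leftFieldCount));
    }
}
```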
[jira] [Created] (CALCITE-4467) Incorrect simplification for 'NaN' value
Jesus Camacho Rodriguez created CALCITE-4467:

Summary: Incorrect simplification for 'NaN' value
Key: CALCITE-4467
URL: https://issues.apache.org/jira/browse/CALCITE-4467
Project: Calcite
Issue Type: Bug
Components: core
Reporter: Jesus Camacho Rodriguez
Assignee: Jesus Camacho Rodriguez

{{RexSimplify}} simplifies {{x = x}} to {{null or x is not null}} (similarly <= and >=), and {{x != x}} to {{null and x is null}} (similarly < and >).
https://github.com/apache/calcite/blob/master/core/src/main/java/org/apache/calcite/rex/RexSimplify.java#L363

This may not be applicable in some cases. For instance, if the type of x is floating-point, x could be 'NaN'. While some RDBMS consider 'NaN' = 'NaN' (e.g., Postgres), others consider 'NaN' != 'NaN', following the IEEE 754 standard. For the latter, the rewriting above will produce incorrect results.

I think we should simply skip this simplification for floating-point types.
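The mismatch is easy to reproduce outside Calcite. The sketch below models SQL three-valued logic with a nullable {{Boolean}} and compares IEEE 754 equality against the simplified form {{null or x is not null}}. The class and method names are made up for illustration, and the engine is assumed to follow IEEE 754 semantics for 'NaN'.

```java
final class NanSimplificationSketch {
    /** SQL-style a = b over nullable doubles, using IEEE 754 comparison; null means UNKNOWN. */
    static Boolean ieeeEquals(Double a, Double b) {
        if (a == null || b == null) {
            return null;
        }
        return a.doubleValue() == b.doubleValue();
    }

    /** The simplified form of x = x: NULL OR x IS NOT NULL. */
    static Boolean simplifiedSelfEquals(Double x) {
        // TRUE when x is not null; NULL OR FALSE evaluates to NULL otherwise.
        return x != null ? Boolean.TRUE : null;
    }

    public static void main(String[] args) {
        Double[] samples = {1.5, null, Double.NaN};
        for (Double x : samples) {
            System.out.println(x + ": x = x -> " + ieeeEquals(x, x)
                + ", simplified -> " + simplifiedSelfEquals(x));
        }
    }
}
```

For an ordinary value and for NULL both forms agree; for NaN the original expression is FALSE under IEEE 754 while the simplified one is TRUE.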
[jira] [Created] (CALCITE-3999) Simplify DialectPool implementation
Jesus Camacho Rodriguez created CALCITE-3999:

Summary: Simplify DialectPool implementation
Key: CALCITE-3999
URL: https://issues.apache.org/jira/browse/CALCITE-3999
Project: Calcite
Issue Type: Improvement
Components: core
Reporter: Jesus Camacho Rodriguez
Assignee: Jesus Camacho Rodriguez

JdbcUtils contains a pool to cache SqlDialect objects. Currently, it relies on multiple maps and a synchronized {{get}} method. Although I am not very familiar with that code, it seems the implementation could be made simpler and more efficient by using a Guava cache. In addition, since we would not have a single synchronized get method, multiple threads could concurrently create dialects for distinct data sources.
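As a rough illustration of the idea, the sketch below replaces "several maps plus one synchronized get" with a single concurrent map that creates values on demand. Note the substitution: the issue proposes a Guava cache, whereas this sketch uses the JDK's {{ConcurrentHashMap.computeIfAbsent}} so that it runs without extra dependencies. {{DialectPoolSketch}} and its type parameters are hypothetical and do not mirror the actual JdbcUtils code.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;

final class DialectPoolSketch<K, V> {
    private final ConcurrentMap<K, V> cache = new ConcurrentHashMap<>();
    private final Function<K, V> factory;

    DialectPoolSketch(Function<K, V> factory) {
        this.factory = factory;
    }

    /**
     * Returns the cached value for the key, creating it at most once.
     * There is no pool-wide lock, so lookups for different keys do not
     * serialize on a single monitor.
     */
    V get(K key) {
        return cache.computeIfAbsent(key, factory);
    }

    public static void main(String[] args) {
        AtomicInteger created = new AtomicInteger();
        // Keys and values are plain strings here; a real pool would key on
        // the data source and hold dialect objects (assumption).
        DialectPoolSketch<String, String> pool = new DialectPoolSketch<>(key -> {
            created.incrementAndGet();
            return "dialect-for-" + key;
        });
        pool.get("jdbc:a");
        pool.get("jdbc:a");
        pool.get("jdbc:b");
        System.out.println("dialects created: " + created.get());
    }
}
```

A Guava cache would add eviction and expiry policies on top of the same get-or-create shape.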
[jira] [Created] (CALCITE-3982) FilterMergeRule can lead to AssertionError
Jesus Camacho Rodriguez created CALCITE-3982:

Summary: FilterMergeRule can lead to AssertionError
Key: CALCITE-3982
URL: https://issues.apache.org/jira/browse/CALCITE-3982
Project: Calcite
Issue Type: Bug
Components: core
Reporter: Jesus Camacho Rodriguez

This could potentially happen since Filter creation has a check on whether the expression is flat ([here|https://github.com/apache/calcite/blob/master/core/src/main/java/org/apache/calcite/rel/core/Filter.java#L74]) and Filter merge does not flatten an expression when it is created.

{noformat}
java.lang.AssertionError: AND(=($3, 100), OR(OR(null, IS NOT NULL(CAST(100):INTEGER)), =(CAST(100):INTEGER, CAST(200):INTEGER)))
	at org.apache.calcite.rel.core.Filter.<init>(Filter.java:74)
	at org.apache.hadoop.hive.ql.optimizer.calcite.reloperators.HiveFilter.<init>(HiveFilter.java:39)
	at org.apache.hadoop.hive.ql.optimizer.calcite.HiveRelFactories$HiveFilterFactoryImpl.createFilter(HiveRelFactories.java:126)
	at org.apache.hadoop.hive.ql.optimizer.calcite.HiveRelBuilder.filter(HiveRelBuilder.java:99)
	at org.apache.calcite.tools.RelBuilder.filter(RelBuilder.java:1055)
	at org.apache.calcite.rel.rules.FilterMergeRule.onMatch(FilterMergeRule.java:81)
{noformat}
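The expression in the assertion contains an OR nested directly inside another OR, which is what "not flat" means here. A minimal sketch of flattening, using a hypothetical {{Expr}} tree rather than Calcite's row expressions, could look like this:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class FlattenSketch {
    /** Hypothetical stand-in for a row expression: an operator name plus operands. */
    static final class Expr {
        final String op;
        final List<Expr> args;

        Expr(String op, List<Expr> args) {
            this.op = op;
            this.args = args;
        }

        static Expr of(String op, Expr... args) {
            return new Expr(op, Arrays.asList(args));
        }

        @Override public String toString() {
            if (args.isEmpty()) {
                return op;
            }
            StringBuilder sb = new StringBuilder(op).append('(');
            for (int i = 0; i < args.size(); i++) {
                if (i > 0) {
                    sb.append(", ");
                }
                sb.append(args.get(i));
            }
            return sb.append(')').toString();
        }
    }

    static boolean isAssociative(String op) {
        return op.equals("AND") || op.equals("OR");
    }

    /** Inlines AND operands of an AND and OR operands of an OR, recursively. */
    static Expr flatten(Expr e) {
        if (e.args.isEmpty()) {
            return e;
        }
        List<Expr> out = new ArrayList<>();
        for (Expr arg : e.args) {
            Expr flat = flatten(arg);
            if (isAssociative(e.op) && flat.op.equals(e.op) && !flat.args.isEmpty()) {
                out.addAll(flat.args); // same operator: lift the operands one level up
            } else {
                out.add(flat);
            }
        }
        return new Expr(e.op, out);
    }

    public static void main(String[] args) {
        // Same shape as the expression in the assertion: AND(a, OR(OR(b, c), d)).
        Expr nested = Expr.of("AND", Expr.of("a"),
            Expr.of("OR", Expr.of("OR", Expr.of("b"), Expr.of("c")), Expr.of("d")));
        System.out.println(flatten(nested));
    }
}
```

Applying such a step to the merged condition before the Filter is created would satisfy the flatness check; whether that is done in the rule or in the builder is a design choice the issue leaves open.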
Re: [ANNOUNCE] New committer: Vineet Garg
Congrats Vineet, well deserved!

-Jesús

On Sun, Apr 26, 2020 at 3:09 AM Leonard Xu wrote:

> Congratulations, Vineet!
>
> Best,
> Leonard Xu
>
> On Apr 26, 2020, at 18:07, xu wrote:
> >
> > Congrats, Vineet!
> >
> > On Sun, Apr 26, 2020 at 4:52 PM, Danny Chan wrote:
> >
> >> Congrats, Vineet!
> >>
> >> Best,
> >> Danny Chan
> >> On Apr 26, 2020, 1:55 PM +0800, dev@calcite.apache.org wrote:
> >>>
> >>> Congrats, Vineet!
> >>
> >
> > --
> >
> > Best regards,
> >
> > Xu
[jira] [Created] (CALCITE-3908) JoinCommuteRule does not update all input references in condition
Jesus Camacho Rodriguez created CALCITE-3908:

Summary: JoinCommuteRule does not update all input references in condition
Key: CALCITE-3908
URL: https://issues.apache.org/jira/browse/CALCITE-3908
Project: Calcite
Issue Type: Bug
Components: core
Reporter: Jesus Camacho Rodriguez
Assignee: Jesus Camacho Rodriguez

{{JoinCommuteRule}} swaps the inputs of a join. It relies on an internal class {{VariableReplacer}} to update the references in the join condition. However, this class does not implement {{RexShuttle}} and ends up ignoring some of the references, e.g., those in {{RexFieldAccess}}.
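The remapping itself is simple arithmetic: after the swap, a field of the old left input moves right by the old right input's field count, and a field of the old right input moves left by the old left input's field count. The sketch below applies that mapping by recursing into every operand, which is the property the issue says is missing. {{Node}}, {{swap}} and the {{FIELD_ACCESS}} operator name are hypothetical; this is not the {{VariableReplacer}} code.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class SwapRefsSketch {
    /** Hypothetical expression node: an input reference (op == null) or a call. */
    static final class Node {
        final String op;
        final int ref;
        final List<Node> children;

        private Node(String op, int ref, List<Node> children) {
            this.op = op;
            this.ref = ref;
            this.children = children;
        }

        static Node ref(int index) {
            return new Node(null, index, new ArrayList<Node>());
        }

        static Node call(String op, Node... children) {
            return new Node(op, -1, Arrays.asList(children));
        }

        @Override public String toString() {
            if (op == null) {
                return "$" + ref;
            }
            StringBuilder sb = new StringBuilder(op).append('(');
            for (int i = 0; i < children.size(); i++) {
                if (i > 0) {
                    sb.append(", ");
                }
                sb.append(children.get(i));
            }
            return sb.append(')').toString();
        }
    }

    /** Maps a field index of join(L, R) to the matching index of join(R, L). */
    static int swapIndex(int index, int leftCount, int rightCount) {
        return index < leftCount ? index + rightCount : index - leftCount;
    }

    /** Rewrites every reference, however deeply it is nested. */
    static Node swap(Node n, int leftCount, int rightCount) {
        if (n.op == null) {
            return Node.ref(swapIndex(n.ref, leftCount, rightCount));
        }
        List<Node> out = new ArrayList<>();
        for (Node child : n.children) {
            out.add(swap(child, leftCount, rightCount));
        }
        return new Node(n.op, -1, out);
    }

    public static void main(String[] args) {
        // A reference wrapped in a field access must be rewritten too.
        Node condition = Node.call("=", Node.call("FIELD_ACCESS", Node.ref(0)), Node.ref(10));
        System.out.println(swap(condition, 3, 8));
    }
}
```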
[jira] [Created] (CALCITE-3898) RelOptPredicateList may generate incorrect map of constant values
Jesus Camacho Rodriguez created CALCITE-3898:

Summary: RelOptPredicateList may generate incorrect map of constant values
Key: CALCITE-3898
URL: https://issues.apache.org/jira/browse/CALCITE-3898
Project: Calcite
Issue Type: Bug
Components: core
Reporter: Jesus Camacho Rodriguez
Assignee: Jesus Camacho Rodriguez

The method relies on {{RexUtil.predicateConstants}}, which in turn calls {{RexUtil.canAssignFrom}}. {{RexUtil.canAssignFrom}} skips any check on precision and scale. I observed the error in Hive when two VARCHAR types with different precision were given to the method, which resulted in the result of the narrowing cast being considered the value of the reference. This led to incorrect results.
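Why a narrowing cast cannot be treated as the value of the reference can be shown with plain strings. The sketch assumes, purely for illustration, a predicate of the form CAST(x AS VARCHAR(3)) = 'abc' and a cast that truncates; the exact predicate shape seen in Hive is not stated in the issue, and {{castToVarchar}} is a hypothetical helper, not a Calcite method.

```java
final class NarrowingCastSketch {
    /** Hypothetical model of CAST(value AS VARCHAR(precision)) that truncates. */
    static String castToVarchar(String value, int precision) {
        return value.length() <= precision ? value : value.substring(0, precision);
    }

    public static void main(String[] args) {
        String x = "abcdef"; // actual column value, e.g. stored as VARCHAR(10)
        boolean predicateHolds = castToVarchar(x, 3).equals("abc");
        boolean xIsConstant = x.equals("abc");
        // The predicate on the narrowed value holds, yet x is not 'abc',
        // so recording x -> 'abc' as a known constant would be wrong.
        System.out.println("predicate holds: " + predicateHolds
            + ", x equals 'abc': " + xIsConstant);
    }
}
```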
Re: [DISCUSS] Code freeze policy during release
Makes sense; I will modify the commit messages in those JIRAs accordingly
once they are pushed again.

-Jesús

On Thu, Feb 27, 2020 at 1:38 PM Julian Hyde wrote:

> I think the best course is that the release manager (Danny) rebases master
> after the RC has been accepted, just before he re-opens the master branch.
> The SHAs of your commits will change, but this is unavoidable.
>
> > On Feb 27, 2020, at 10:58 AM, Jesus Camacho Rodriguez wrote:
> >
> > I had not been very active for some time and I did not see this message
> > in the mailing list.
> >
> > I am responsible for pushing some of those commits while the release is
> > going on; I am sorry about that. IIRC we could force commit on master? I
> > can fix it if you think it's the right thing to do.
> >
> > Thanks,
> > Jesús
> >
> > On Wed, Feb 26, 2020 at 10:50 AM Michael Mior wrote:
> >
> >> I don't believe we have ever articulated an "official" policy on this.
> >> But yes, it's generally expected that once the process of preparing a
> >> release has started, no one will commit to master without checking
> >> with the release manager. It's up to this person to judge whether a
> >> commit is important/safe enough to include in the release. I haven't
> >> checked who authored these commits, but I'm going to assume the
> >> possibility they were not aware of a release in progress. This is a
> >> good reminder to all committers to keep aware of the release cycle.
> >> --
> >> Michael Mior
> >> mm...@apache.org
> >>
> >> On Wed, Feb 26, 2020 at 09:33, Stamatis Zampetakis wrote:
> >>>
> >>> Hello,
> >>>
> >>> As far as I remember [1, 2, 3] the commits on master are suspended
> >>> during the release process.
> >>> In principle, if there is a commit that should go in it should pass by
> >>> the release manager and it's up to him to decide if he wants to include
> >>> it or not.
> >>> Now if I am missing something I leave it to the more senior members to
> >>> fill in the details.
> >>>
> >>> Best,
> >>> Stamatis
> >>>
> >>> [1] https://lists.apache.org/thread.html/5cfea82c9e14d921e80ab3beb35d508996d18de1d45dd61404b21ca1%40%3Cdev.calcite.apache.org%3E
> >>> [2] https://lists.apache.org/thread.html/1d2d226d501154aa909364c0f954da7f41e401e92013129d772d041c%40%3Cdev.calcite.apache.org%3E
> >>> [3] https://lists.apache.org/thread.html/4402577cc2a596e0b221bbde53cac9f0d88568373d337f435199eb46%40%3Cdev.calcite.apache.org%3E
> >>>
> >>> On Wed, Feb 26, 2020 at 3:12 PM Chunwei Lei wrote:
> >>>
> >>>> Thanks for pointing this out, Ruben. I also have this question.
> >>>>
> >>>> But in our scrum, we can merge commits to master at the moment we have
> >>>> a release branch.
> >>>>
> >>>> Best,
> >>>> Chunwei
> >>>>
> >>>> On Wed, Feb 26, 2020 at 8:52 PM Ruben Q L wrote:
> >>>>
> >>>>> Hello everyone,
> >>>>>
> >>>>> as you know, we are in the middle of the release process for 1.22
> >>>>> (btw, thanks Danny for your effort as release manager).
> >>>>> However, if I am not mistaken I can see that some PRs have been
> >>>>> merged during this time, at least [1] and [2]. I am wondering if
> >>>>> during this process (from the build of the first release candidate to
> >>>>> the final approval of the release), we should not be in some kind of
> >>>>> "code freeze", where commits are not allowed, unless they are
> >>>>> explicitly approved (ultimately by the release manager, I guess) in
> >>>>> order to solve issues with the release candidates (e.g. [3]). Is
> >>>>> there any rule / guideline about this?
> >>>>>
> >>>>> Best regards,
> >>>>> Ruben.
> >>>>>
> >>>>> [1] https://issues.apache.org/jira/browse/CALCITE-3817
> >>>>> [2] https://issues.apache.org/jira/browse/CALCITE-3734
> >>>>> [3] https://issues.apache.org/jira/browse/CALCITE-3822
Re: [DISCUSS] Code freeze policy during release
I had not been very active for some time and I did not see this message in
the mailing list.

I am responsible for pushing some of those commits while the release is
going on; I am sorry about that. IIRC we could force commit on master? I can
fix it if you think it's the right thing to do.

Thanks,
Jesús

On Wed, Feb 26, 2020 at 10:50 AM Michael Mior wrote:

> I don't believe we have ever articulated an "official" policy on this.
> But yes, it's generally expected that once the process of preparing a
> release has started, no one will commit to master without checking
> with the release manager. It's up to this person to judge whether a
> commit is important/safe enough to include in the release. I haven't
> checked who authored these commits, but I'm going to assume the
> possibility they were not aware of a release in progress. This is a
> good reminder to all committers to keep aware of the release cycle.
> --
> Michael Mior
> mm...@apache.org
>
> On Wed, Feb 26, 2020 at 09:33, Stamatis Zampetakis wrote:
> >
> > Hello,
> >
> > As far as I remember [1, 2, 3] the commits on master are suspended
> > during the release process.
> > In principle, if there is a commit that should go in it should pass by
> > the release manager and it's up to him to decide if he wants to include
> > it or not.
> > Now if I am missing something I leave it to the more senior members to
> > fill in the details.
> >
> > Best,
> > Stamatis
> >
> > [1] https://lists.apache.org/thread.html/5cfea82c9e14d921e80ab3beb35d508996d18de1d45dd61404b21ca1%40%3Cdev.calcite.apache.org%3E
> > [2] https://lists.apache.org/thread.html/1d2d226d501154aa909364c0f954da7f41e401e92013129d772d041c%40%3Cdev.calcite.apache.org%3E
> > [3] https://lists.apache.org/thread.html/4402577cc2a596e0b221bbde53cac9f0d88568373d337f435199eb46%40%3Cdev.calcite.apache.org%3E
> >
> > On Wed, Feb 26, 2020 at 3:12 PM Chunwei Lei wrote:
> >
> > > Thanks for pointing this out, Ruben. I also have this question.
> > >
> > > But in our scrum, we can merge commits to master at the moment we have
> > > a release branch.
> > >
> > > Best,
> > > Chunwei
> > >
> > > On Wed, Feb 26, 2020 at 8:52 PM Ruben Q L wrote:
> > >
> > > > Hello everyone,
> > > >
> > > > as you know, we are in the middle of the release process for 1.22
> > > > (btw, thanks Danny for your effort as release manager).
> > > > However, if I am not mistaken I can see that some PRs have been
> > > > merged during this time, at least [1] and [2]. I am wondering if
> > > > during this process (from the build of the first release candidate to
> > > > the final approval of the release), we should not be in some kind of
> > > > "code freeze", where commits are not allowed, unless they are
> > > > explicitly approved (ultimately by the release manager, I guess) in
> > > > order to solve issues with the release candidates (e.g. [3]). Is
> > > > there any rule / guideline about this?
> > > >
> > > > Best regards,
> > > > Ruben.
> > > >
> > > > [1] https://issues.apache.org/jira/browse/CALCITE-3817
> > > > [2] https://issues.apache.org/jira/browse/CALCITE-3734
> > > > [3] https://issues.apache.org/jira/browse/CALCITE-3822
[jira] [Created] (CALCITE-3825) Split AbstractMaterializedViewRule into multiple classes
Jesus Camacho Rodriguez created CALCITE-3825:

Summary: Split AbstractMaterializedViewRule into multiple classes
Key: CALCITE-3825
URL: https://issues.apache.org/jira/browse/CALCITE-3825
Project: Calcite
Issue Type: Improvement
Components: core
Reporter: Jesus Camacho Rodriguez
Assignee: Jesus Camacho Rodriguez

AbstractMaterializedViewRule contains a materialized view-based rewriting algorithm that has been there for multiple releases and is used by some engines relying on Calcite, e.g., Apache Hive. The main reason to have a single file/class for the rule was to keep the logic self-contained from the outset instead of spreading it across multiple files, as it was experimental and we were not sure how far the implementation would go. In retrospect, we should have refactored that code sooner, since the single class makes it very difficult to understand and maintain logic that is already complicated enough.

This issue is to split AbstractMaterializedViewRule into multiple files/classes (it already contains multiple internal classes).
Re: [ANNOUNCE] New Calcite PMC chair: Stamatis Zampetakis
Congrats Stamatis! Well deserved!

-Jesús

On Thu, Dec 19, 2019 at 8:51 PM Julian Hyde wrote:

> Glad to have you as the new chair, Stamatis! You have been a mature,
> helpful and moderating voice in the community for quite some time. Well
> deserved.
>
> Francis, thank you for serving as chair. Calcite became better and
> stronger under your watch.
>
> I am delighted that we have had 5 chairs in the four years since Calcite
> graduated (me, Jesus, Michael, Francis and now Stamatis). Each of the
> chairs has been excellent, has contributed something different, and all
> are still actively involved in the community.
>
> Julian
>
> > On Dec 19, 2019, at 1:32 AM, Igor Guzenko wrote:
> >
> > Congratulations, Stamatis!
> >
> > On Thu, Dec 19, 2019 at 9:04 AM Alessandro Solimando <
> > alessandro.solima...@gmail.com> wrote:
> >
> >> Congratulations, Stamatis!
> >>
> >> On Thu, Dec 19, 2019, 07:16 Enrico Olivelli wrote:
> >>
> >>> Congratulations Stamatis!
> >>>
> >>> Enrico
> >>>
> >>> On Thu, Dec 19, 2019, 04:40 Rui Wang wrote:
> >>>
> >>>> Congratulations and Thanks Stamatis!
> >>>>
> >>>> -Rui
> >>>>
> >>>> On Wed, Dec 18, 2019 at 6:52 PM XING JIN wrote:
> >>>>
> >>>>> Congratulations Stamatis!
> >>>>>
> >>>>> -Jin
> >>>>>
> >>>>> On Thu, Dec 19, 2019 at 10:33 AM, Chunwei Lei wrote:
> >>>>>
> >>>>>> Congratulations Stamatis!
> >>>>>>
> >>>>>> Best,
> >>>>>> Chunwei
> >>>>>>
> >>>>>> On Thu, Dec 19, 2019 at 9:36 AM Danny Chan wrote:
> >>>>>>
> >>>>>>> Congratulations Stamatis!
> >>>>>>>
> >>>>>>> Best,
> >>>>>>> Danny Chan
> >>>>>>> On Dec 19, 2019, 7:37 AM +0800, dev@calcite.apache.org wrote:
> >>>>>>>>
> >>>>>>>> Congratulations Stamatis!
Re: Re: [DISCUSS] New Apache Calcite Chair
+1

On Mon, Nov 25, 2019 at 12:29 PM Haisheng Yuan wrote:

> Thank you for your service as a chair, Francis, I really appreciate it.
>
> +1 for Stamatis, who has been instrumental in issue discussions, answering
> questions and making Calcite a more friendly community together with other
> contributors. Looking forward to working with you more closely.
>
> - Haisheng
>
> --
> From: Rui Wang
> Date: 2019-11-26 04:02:25
> To:
> Subject: Re: [DISCUSS] New Apache Calcite Chair
>
> +1!
>
> Thanks for Francis's contribution as PMC chair in the past year. Looking
> forward to Stamatis being PMC chair in the next year!
>
> On Sun, Nov 24, 2019 at 6:57 PM Chunwei Lei wrote:
>
> > Francis, thank you for your great job. I really respect it.
> >
> > I am +1 for Stamatis being the PMC chair. He is such a humble,
> > energetic person.
> > I believe he can bring more excellent ideas about our project.
> >
> > Best,
> > Chunwei
> >
> > On Mon, Nov 25, 2019 at 9:26 AM Danny Chan wrote:
> >
> > > Thanks for the great work, Francis, you are such a nice person with a
> > > mild personality. Generally you keep the project in pretty good shape ~
> > >
> > > I'm +1 for Stamatis being the PMC chair, he is always knowledgeable,
> > > also he made some bylaws for our project.
> > >
> > > Best,
> > > Danny Chan
> > > On Nov 25, 2019, 9:13 AM +0800, Julian Hyde wrote:
> > > > First of all, thank you, Francis, for your year of service as chair.
> > > > The project is busier than ever, and also more welcoming than ever to
> > > > new members. Your personality has set the tone for the project, and
> > > > your energy has driven some much-needed modernization.
> > > >
> > > > I believe Stamatis would be an excellent choice, and that he should
> > > > be the candidate on the ballot.
> > > >
> > > > I want to call out Danny Chan and Haisheng Yuan. They have been very
> > > > active and helpful in the project over the past few months, have
> > > > become PMC members, and maybe we should consider them for chairs in
> > > > future years.
> > > >
> > > > Julian
> > > >
> > > > > On Nov 21, 2019, at 1:48 PM, Francis Chuang <
> > > > > francischu...@apache.org> wrote:
> > > > >
> > > > > Hey everyone,
> > > > >
> > > > > It's Calcite's tradition to rotate the chair every year. When we
> > > > > had the State of the Project discussion [1] a month ago, there was
> > > > > some consensus towards nominating Stamatis Zampetakis as our next
> > > > > chair.
> > > > >
> > > > > Stamatis has been a prolific contributor to the project and I
> > > > > believe he would be a very good choice to chair the project in the
> > > > > coming year.
> > > > >
> > > > > Please let us know your thoughts!
> > > > >
> > > > > Francis
> > > > >
> > > > > [1] https://lists.apache.org/thread.html/e85dbc88546687571c5dd9fd999a19d0eae52004c3b3e5b92c67d9bb@%3Cdev.calcite.apache.org%3E
Re: Apache Calcite meetup group
Thanks for adding me as co-organizer, Denis. I will be happy to work with you
and other members of the community to organize meetups in the Bay Area.

-Jesús

On Tue, Oct 22, 2019 at 4:53 AM Alessandro Solimando <
alessandro.solima...@gmail.com> wrote:

> Hello,
> despite not being able to contribute actively I keep following the ML and
> the project, I'd love to attend meetups here in Paris if any!
>
> Best regards,
> Alessandro
>
> On Tue, 22 Oct 2019 at 05:08, Julian Hyde wrote:
>
> > Best thing would be for you and Jesus to collaborate. Feel free to do it
> > on this list, or back-channel.
> >
> > I don't think it matters who is formally the organizer as long as someone
> > is organizing events.
> >
> > > On Oct 21, 2019, at 7:58 PM, Denis Magda wrote:
> > >
> > > Julian,
> > >
> > > Exactly, any meetup group member can take over a group after the
> > > organizer(s) step down. I decided to take control after receiving a
> > > second notification threatening to shut the group down in the near
> > > future. Came across this discussion afterwards.
> > >
> > > As of now, I don't mind stepping down as a co-organizer or can
> > > collaborate with Jesus on potential local meetups. Whatever Calcite PMC
> > > prefers ;)
> > >
> > > Denis
> > >
> > > On Monday, October 21, 2019, Julian Hyde wrote:
> > >
> > >> Thank you, Denis!
> > >>
> > >> I notice that it is fairly easy for someone to take control over a
> > >> meetup group, so let me say something about governance. To be clear, I
> > >> have no indications that anyone has anything but the best intentions
> > >> for this group, so this is just me worrying about a hypothetical worst
> > >> case. The group's name and description includes the trademark "Apache
> > >> Calcite" and therefore we, the PMC, have the right of veto over what
> > >> the meetup group does. Or request that it changes its name.
> > >>
> > >> That would give us some control over this group, if it should ever
> > >> come to that. To repeat, I have no doubt that everyone is acting in
> > >> good faith.
> > >>
> > >> Julian
> > >>
> > >>> On Oct 21, 2019, at 4:09 PM, Denis Magda wrote:
> > >>>
> > >>> Hi Jesús,
> > >>>
> > >>> I've paid for the next 6 months and made you a co-organizer. Let's
> > >>> work together and arrange a local meetup in the Bay Area.
> > >>>
> > >>> -
> > >>> Denis
> > >>>
> > >>>> On 18.10.19, 19:53, "Jesus Camacho Rodriguez" <jcama...@apache.org>
> > >>>> wrote:
> > >>>>
> > >>>> It seems someone else (Denis Magda) paid the fees in the meantime.
> > >>>>
> > >>>> -Jesús
> > >>>>
> > >>>> On Fri, Oct 18, 2019 at 1:32 AM Danny Chan wrote:
> > >>>>
> > >>>>> Thanks Jesús for taking over this!
> > >>>>>
> > >>>>> Best,
> > >>>>> Danny Chan
> > >>>>> On Oct 18, 2019, 2:00 PM +0800, dev@calcite.apache.org wrote:
> > >>>>>>
> > >>>>>> Jesús
> > >
> > > --
> > > -
> > > Denis
Re: Apache Calcite meetup group
It seems someone else (Denis Magda) paid the fees in the meantime.

-Jesús

On Fri, Oct 18, 2019 at 1:32 AM Danny Chan wrote:

> Thanks Jesús for taking over this!
>
> Best,
> Danny Chan
> On Oct 18, 2019, 2:00 PM +0800, dev@calcite.apache.org wrote:
> >
> > Jesús
Re: Apache Calcite meetup group
Absolutely!

Thanks,
Jesús

On Thu, Oct 17, 2019 at 3:46 PM Julian Hyde wrote:

> I would be delighted if you would do that - thank you!
>
> If you are organizing a meet up, please consult with this list. There may
> be people who would be willing to speak and have something interesting to
> say.
>
> Julian
>
> > On Oct 16, 2019, at 3:14 PM, Jesus Camacho Rodriguez <
> > jcama...@apache.org> wrote:
> >
> > Hi Julian,
> >
> > I have just seen your message. Although we have other ways to
> > communicate, I believe it may be valuable to keep the group even if a
> > meetup has not happened for a while (we may organize some meetups in the
> > future, those interested in Calcite may be subscribed to group to attend
> > talks around the project even if they do not follow the project closely
> > through mailing list, etc.). I would be happy to pay the fees to keep it.
> > I have just checked and I think I could simply go ahead and pay, but let
> > me know if I need to do anything else.
> >
> > Thanks,
> > Jesús
> >
> > On Wed, Oct 16, 2019 at 11:21 AM Julian Hyde wrote:
> >
> >> If you're a member of the Apache Calcite meetup group[1], you probably
> >> just received an email saying that the group is shutting down. I set it
> >> up a few years ago, but I never find time to organize meetups, so I
> >> decided to stop paying the annual fee to meetup.com.
> >>
> >> I'm not particularly sad that it's closing down, given that it has been
> >> inactive, and we as a community seem to find other ways to talk to each
> >> other. But if someone in the community would like to organize some
> >> meetups and is prepared to pay the fees, I'm happy to hand over the
> >> reins.
> >>
> >> Julian
> >>
> >> [1] https://www.meetup.com/Apache-Calcite/
[jira] [Created] (CALCITE-3189) Multiple fixes for Oracle SQL dialect
Jesus Camacho Rodriguez created CALCITE-3189: Summary: Multiple fixes for Oracle SQL dialect Key: CALCITE-3189 URL: https://issues.apache.org/jira/browse/CALCITE-3189 Project: Calcite Issue Type: Bug Affects Versions: 1.20.0 Reporter: Jesus Camacho Rodriguez Assignee: Jesus Camacho Rodriguez Among others, it includes i) SQL translation support for custom types (e.g. {{SMALLINT}} --> {{NUMBER(5)}}), ii) limiting max length of {{VARCHAR}} type, iii) creating datetime literals correctly, and iv) method to infer whether a given data type is supported or not. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (CALCITE-3066) RelToSqlConverter may incorrectly throw an AssertionError for some decimal literals
Jesus Camacho Rodriguez created CALCITE-3066:

Summary: RelToSqlConverter may incorrectly throw an AssertionError for some decimal literals
Key: CALCITE-3066
URL: https://issues.apache.org/jira/browse/CALCITE-3066
Project: Calcite
Issue Type: Bug
Components: core
Reporter: Jesus Camacho Rodriguez
Assignee: Jesus Camacho Rodriguez

The issue can be reproduced by adding the following query to {{RelToSqlConverterTest}}:
{code:sql}
select -0.000123 from "expense_fact";
{code}

{code}
Caused by: java.lang.AssertionError: -1.23E-8
	at org.apache.calcite.sql.SqlLiteral.createExactNumeric(SqlLiteral.java:872)
	at org.apache.calcite.rel.rel2sql.SqlImplementor$Context.toSql(SqlImplementor.java:502)
	at org.apache.calcite.rel.rel2sql.RelToSqlConverter.visit(RelToSqlConverter.java:186)
	... 34 more
{code}
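The text of the assertion is in scientific notation, which is how {{java.math.BigDecimal.toString()}} prints values of small magnitude. The sketch below only demonstrates that JDK behaviour and the plain-notation alternative; it does not claim to reproduce the converter's code path. The input value is an assumption chosen to match the printed assertion text, since the literal shown in the query above prints in plain notation with either method.

```java
import java.math.BigDecimal;

final class DecimalLiteralSketch {
    /** Default rendering; switches to scientific notation for small magnitudes. */
    static String defaultString(String literal) {
        return new BigDecimal(literal).toString();
    }

    /** Rendering that never uses an exponent. */
    static String plainString(String literal) {
        return new BigDecimal(literal).toPlainString();
    }

    public static void main(String[] args) {
        String small = "-0.0000000123"; // assumed value, matches "-1.23E-8"
        System.out.println(defaultString(small)); // prints -1.23E-8
        System.out.println(plainString(small));   // prints -0.0000000123
        System.out.println(defaultString("-0.000123")); // prints -0.000123
    }
}
```

Code that re-parses a decimal rendered with {{toString()}} has to cope with the exponent form; rendering with {{toPlainString()}} avoids it.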
Re: ClassCastException RelOptCostImpl VolcanoCost while using HepPlanner
> Since we are talking about materialized views, I think in most cases
> tableRel should be simply a LogicalTableScan.

Stamatis is correct about this; I had not realized that tableRel == queryRel
in your sample code.

Thanks,
Jesús

On Fri, May 3, 2019 at 6:12 AM Stamatis Zampetakis wrote:

> I think the main problem comes from the fact that tableRel == queryRel in
> the test case you provided.
> Defining the materialized view like that basically says that when you find
> a part of the query that satisfies queryRel, replace it with itself.
> In conjunction with the rule that is used, which allows partial rewritings
> using union, you end up with a rule that matches an infinite number of
> times.
> Since we are talking about materialized views, I think in most cases
> tableRel should be simply a LogicalTableScan.
> The idea is that the expression represented by queryRel is materialized
> into a table, so in order to retrieve the results we only need to scan the
> table.
>
> Regarding the "if (true)" statements that you observed, most likely they
> were introduced as release toggles [1].
> However, since the last commit was in 2013 I think by now it is safe to
> refactor that part and remove dead code.
>
> [1] https://www.martinfowler.com/articles/feature-toggles.html
>
> Best,
> Stamatis
>
> On Fri, May 3, 2019 at 12:50 PM Mark Pasterkamp <
> markpasterkamp1...@gmail.com> wrote:
>
> > Dear Jesus,
> >
> > I think your intuition in this regard is correct.
> > After executing the main program in the HepPlanner the resulting plan
> > contains a lot of circular references.
> > Changing the matching order does not influence this behaviour.
> >
> > Mark
> >
> > On Thu, 2 May 2019 at 22:14, Jesus Camacho Rodriguez wrote:
> >
> > > Mark,
> > >
> > > I have an intuition that this happens because the rule creates a
> > > partially contained rewriting with a union, where one side contains a
> > > scan over the materialized view and the other side contains the query
> > > itself with a filter on top excluding the data that is coming from the
> > > materialized view. Then the rule is triggered on the plan representing
> > > the original query again and the process is repeated. Have you tried
> > > changing the matching order for your hep program?
> > >
> > > Thanks,
> > > Jesús
> > >
> > > On Thu, May 2, 2019 at 8:53 AM Mark Pasterkamp <
> > > markpasterkamp1...@gmail.com> wrote:
> > >
> > > > Hi Stamatis,
> > > >
> > > > I have tried to recreate the issue but I have not been able to do
> > > > that. I was however able to create a new exception which I don't
> > > > quite understand. The error happened when calcite was creating a
> > > > union rewriting using materialized views. But trying to recreate this
> > > > situation gave me another interesting one.
> > > > This time, the planner rewrites one of the children nodes into
> > > > itself I would assume, which causes a stack overflow. The method
> > > > itself can be found here:
> > > >
> > > > https://github.com/mpasterkamp/calcite/blob/768b7928dbde5f6f9775a1119e7466d8eafafb4b/core/src/test/java/org/apache/calcite/test/HepPlannerTest.java#L312
> > > >
> > > > Perhaps I am doing something wrong, perhaps not? I am not
> > > > knowledgeable enough about this to understand why this is happening.
> > > > Wish I could help more for that.
> > > >
> > > > Also, while investigating this issue I found another interesting
> > > > artifact in the source code of the VolcanoCost. A lot of methods in
> > > > this class have an "if (true)"-statement like here:
> > > >
> > > > https://github.com/apache/calcite/blob/4b4d8037c5073e4eb5702b12bc4ecade31476616/core/src/main/java/org/apache/calcite/plan/volcano/VolcanoCost.java#L100
> > > >
> > > > Now I was just curious, is there any reason for this to be there that
> > > > you know of?
> > > >
> > > > Thank you for responding and congratulations for your recent
> > > > promotions.
> > > >
> > > > With kind regards,
> > > >
> > > > Mark
> > > >
> > > > On T
Re: ClassCastException RelOptCostImpl VolcanoCost while using HepPlanner
Mark, I have an intuition that this happens because the rule creates a partially contained rewriting with a union, where one side contains a scan over the materialized view and the other side contains the query itself with a filter on top excluding the data that is coming from the materialized view. Then the rule is triggered on the plan representing the original query again and the process is repeated. Have you tried changing the matching order for your hep program? Thanks, Jesús On Thu, May 2, 2019 at 8:53 AM Mark Pasterkamp wrote: > Hi Stamatis, > > I have tried to recreate the issue but I have not been able to do that. I > was however able to create a new exception which I don't quite understand. > The error happened when calcite was creating a union rewriting using > materialized views. But trying to recreate this situation gave me another > interesting one. > This time, the planner rewrites one of the children nodes into itself I > would assume which causes a stack overflow. The method itself can be found > here: > > > https://github.com/mpasterkamp/calcite/blob/768b7928dbde5f6f9775a1119e7466d8eafafb4b/core/src/test/java/org/apache/calcite/test/HepPlannerTest.java#L312 > > Perhaps I am doing something wrong, perhaps not? I am not knowledgeable > enough about this to understand why this is happening. Wish I could help > more for that. > > Also, while investigating this issue I found another interesting artifact > in de source code of the VolcanoCost. A lot of methods in this class have > an "if (true)"-statement like here: > > > https://github.com/apache/calcite/blob/4b4d8037c5073e4eb5702b12bc4ecade31476616/core/src/main/java/org/apache/calcite/plan/volcano/VolcanoCost.java#L100 > > Now I was just curious, is there any reason for this to be there that you > know of? > > Thank you for responding and congratulations for your recent promotions. 
> > > With kind regards, > > Mark > > > On Thu, 2 May 2019 at 14:58, Stamatis Zampetakis > wrote: > > > Said like that it looks like a bug. > > > > I think the best would be to reproduce the exception as a unit test in > > HepPlannerTest [1], RelOptRulesTest [2], or PlannerTest [3] so that we > > could understand better the use case. > > > > [1] > > > > > https://github.com/apache/calcite/blob/master/core/src/test/java/org/apache/calcite/test/HepPlannerTest.java > > [2] > > > > > https://github.com/apache/calcite/blob/master/core/src/test/java/org/apache/calcite/test/RelOptRulesTest.java > > [3] > > > > > https://github.com/apache/calcite/blob/master/core/src/test/java/org/apache/calcite/tools/PlannerTest.java > > > > On Thu, May 2, 2019 at 7:56 AM Mark Pasterkamp < > > markpasterkamp1...@gmail.com> > > wrote: > > > > > I don't, I would assume that the HepPlanner.findBestExp() calculates > the > > > cost somewhere down the line > > > > > > On Thu, May 2, 2019, 03:31 Yuzhao Chen wrote: > > > > > > > Why you care about cost when use HepPlanner ? The HepPlanner is aimed > > for > > > > some deterministic planning rules, we usually do not need cost in > Hep. > > > Some > > > > exceptions like Join reorder may need a cost. > > > > > > > > What kind of planning promotion you did ? I'm kind of curious about > it. > > > > > > > > Best, > > > > Danny Chan > > > > 在 2019年5月1日 +0800 PM9:27,Mark Pasterkamp < > markpasterkamp1...@gmail.com > > > > >,写道: > > > > > Dear all, > > > > > > > > > > While playing around with the HepPlanner I ran into an issue where > > the > > > > > planner wants to rewrite a query with a union rewrite. When the > > > > > RelMetaDataQuery computes the cost, the cost instance is a > > VolcanoCost. > > > > > Then when it tries to calculate the cost of one of the union's > > operands > > > > it > > > > > is a RelCostImpl which results in the ClassCastException. > > > > > > > > > > How would I go about solving this issue? 
As far as my knowledge > > goes, I > > > > am > > > > > not able to change the costhandler of the RelMetaDataQuery. Another > > > > > approach I could see is removing the cast in the VolcanoCost class, > > > but I > > > > > would hope I do not have to do that. > > > > > > > > > > > > > > > With kind regards, > > > > > > > > > > Mark > > > > > > > > > >
[jira] [Created] (CALCITE-2976) Improve materialized view rewriting coverage with disjunctive predicates
Jesus Camacho Rodriguez created CALCITE-2976: Summary: Improve materialized view rewriting coverage with disjunctive predicates Key: CALCITE-2976 URL: https://issues.apache.org/jira/browse/CALCITE-2976 Project: Calcite Issue Type: Improvement Components: core Reporter: Jesus Camacho Rodriguez Assignee: Jesus Camacho Rodriguez For instance, in the following case: {code} @Test public void testJoinAggregateMaterializationAggregateFuncs14() { checkMaterialize( "select \"empid\", \"emps\".\"name\", \"emps\".\"deptno\", \"depts\".\"name\", " + "count(*) as c, sum(\"empid\") as s\n" + "from \"emps\" join \"depts\" using (\"deptno\")\n" + "where (\"depts\".\"name\" is not null and \"emps\".\"name\" = 'a') or " + "(\"depts\".\"name\" is not null and \"emps\".\"name\" = 'b')\n" + "group by \"empid\", \"emps\".\"name\", \"depts\".\"name\", \"emps\".\"deptno\"", "select \"depts\".\"deptno\", sum(\"empid\") as s\n" + "from \"emps\" join \"depts\" using (\"deptno\")\n" + "where \"depts\".\"name\" is not null and \"emps\".\"name\" = 'a'\n" + "group by \"depts\".\"deptno\"", HR_FKUK_MODEL, CONTAINS_M0); } {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (CALCITE-2951) Support decorrelate subquery that has aggregate with grouping sets
Jesus Camacho Rodriguez created CALCITE-2951: Summary: Support decorrelate subquery that has aggregate with grouping sets Key: CALCITE-2951 URL: https://issues.apache.org/jira/browse/CALCITE-2951 Project: Calcite Issue Type: Sub-task Reporter: Jesus Camacho Rodriguez Assignee: Haisheng Yuan Fix For: 1.20.0 -- This message was sent by Atlassian JIRA (v7.6.3#76005)
Re: [VOTE] Release apache-calcite-1.19.0 (release candidate 2)
+1 (binding) - Checked signatures - Built and ran test suite - Ran a few Hive tests Thanks, Jesús On 3/22/19, 4:53 PM, "Stamatis Zampetakis" wrote: System: MacOS 10.13.6, jdk9, maven 3.5.2 -checked signatures and checksums OK -went quickly over release note OK -run unit tests (mvn clean install) on staged sources and git repo OK -run integration tests (mvn -Dtest=foo -DfailIfNoTests=false -Pit verify -fn) KO A brief summary of the errors is given below: [ERROR] Tests run: 290, Failures: 1, Errors: 0, Skipped: 21, Time elapsed: 23.703 s <<< FAILURE! - in org.apache.calcite.test.JdbcTest (MySQL) [ERROR] Tests run: 290, Failures: 0, Errors: 1, Skipped: 21, Time elapsed: 34.468 s <<< FAILURE! - in org.apache.calcite.test.JdbcTest (Postgres) [ERROR] Tests run: 36, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 1.478 s <<< FAILURE! - in org.apache.calcite.test.JdbcAdapterTest (Postgres) -run slow tests (mvn clean install -Dcalcite.test.slow) KO I tried a few times but the tests never finished. -run tests on downstream project OK I don't think any of the above problems should block the release so my vote is: +1 (non-binding) Thanks for pushing this forward Kevin! Στις Παρ, 22 Μαρ 2019 στις 5:15 μ.μ., ο/η Andrei Sereda έγραψε: > +1 (non-binding). > > Thank you for your hard work, Kevin. > > Environment: MacOS. Maven 3.6.0. jdk1.8.0_181. > > mvn clean install - OK > ./mvnw clean install - OK > sh256 check - OK > GPG check - OK > Release notes - OK > > > > On Thu, Mar 21, 2019 at 5:20 PM Julian Hyde wrote: > > > +1 (binding) > > > > Checked hashes, LICENSE, NOTICE, README, HISTORY.md, built and ran tests > > using Ubuntu JDK 11, using both mvn and mvnw, ran rat. > > > > Thanks Kevin! > > > > Julian > > > > > > > On Mar 21, 2019, at 5:05 AM, Enrico Olivelli > > wrote: > > > > > > great work Kevin > > > > > > +1 (non binding) > > > checked signatures/checksum > > > built from source on JDK11 (AdoptOpenJDK 11.0.2) on Linux, all tests > > passed. 
> > > > > > all tests of downstream project HerdDB passed, 100% compatible, > > > upgrading from 1.17.0 > > > > > > Regards > > > Enrico > > > > > > > > > Il giorno mer 20 mar 2019 alle ore 22:58 Francis Chuang > > > ha scritto: > > >> > > >> +1 (binding) > > >> > > >> Environment: maven:latest docker image (Maven 3.6.0, OpenJDK 11.0.2, > > >> Debian stretch). > > >> > > >> Verified SHA256 hash - OK > > >> Verified GPG signature - OK > > >> Ran tests using ./mvnw -DskipTests clean install and ./mvnw test - OK > > >> Checked history.md - OK > > >> > > >> Francis > > >> > > >> On 21/03/2019 5:05 am, Kevin Risden wrote: > > >>> Hi all, > > >>> > > >>> I have created a build for Apache Calcite 1.19.0, release candidate > 2. > > >>> > > >>> Thanks to everyone who has contributed to this release. > > >>> > > >>> Since RC 1, we have fixed the following issues: > > >>> * [CALCITE-2929] Simplification of IS NULL checks are incorrectly > > assuming > > >>> that CAST-s are > > >>> possible > > >>> * [CALCITE-2931] Mongo Adapter- Compare Bson (not string) query > > >>> representation in tests > > >>> * [CALCITE-2932] Update stale Druid integration test cases > > >>> > > >>> You can read the release notes here: > > >>> > > https://github.com/apache/calcite/blob/branch-1.19/site/_docs/history.md > > >>> > > >>> The commit to be voted upon: > > >>> > > > https://gitbox.apache.org/repos/asf?p=calcite.git;a=commit;h=4143176acdb2860b3a80eb18e4cb1557f5969d13 > > >>> > > >>> Its hash is 4143176acdb2860b3a80eb18e4cb1557f5969d13. 
> > >>> > > >>> The artifacts to be voted on are located here: > > >>> > > > https://dist.apache.org/repos/dist/dev/calcite/apache-calcite-1.19.0-rc2/ > > >>> > > >>> The hashes of the artifacts are as follows: > > >>> src.tar.gz.sha256 > > >>> 833fa3e9c97d8e89443f3b7a62fc6cce5f0e734836d82db1443425be6ac5dc65 > > >>> > > >>> A staged Maven repository is available for review at: > > >>> > > > https://repository.apache.org/content/repositories/orgapachecalcite-1057/ > > >>> > > >>> Release artifacts are signed with the following key: > > >>> https://people.apache.org/keys/committer/krisden.asc > > >>> > > >>> Please vote on releasing this package as Apache Calcite 1.19.0. > > >>> > > >>> The vote is open for the next 72 hours and passes if a majority of > > >>> at least three +1 PMC votes are cast. > > >>> > > >>> [ ] +1 Release this package as Apache Calcite 1.19.0 > > >>> [ ] 0 I don't feel strongly about it, but
[jira] [Created] (CALCITE-2943) Materialized view rewriting logic calls getApplicableMaterializations each time the rule is triggered
Jesus Camacho Rodriguez created CALCITE-2943: Summary: Materialized view rewriting logic calls getApplicableMaterializations each time the rule is triggered Key: CALCITE-2943 URL: https://issues.apache.org/jira/browse/CALCITE-2943 Project: Calcite Issue Type: Improvement Components: core Reporter: Jesus Camacho Rodriguez Assignee: Jesus Camacho Rodriguez Attachments: Screen Shot 2019-03-21 at 2.33.01 PM.png {{RelOptMaterializations.getApplicableMaterializations}} is called each time the rule is triggered. {code:java} ... // Obtain applicable (filtered) materializations // TODO: Filtering of relevant materializations needs to be // improved so we gather only materializations that might // actually generate a valid rewriting. final List applicableMaterializations = RelOptMaterializations.getApplicableMaterializations(node, materializations); ... {code} When I implemented the rule, I assumed (incorrectly) that {{getApplicableMaterializations}} was a lightweight call and hence would help to quickly discard materialized views extracted from the planner. It turns out that the method can quickly become the most time-consuming part of the rule execution; I assume the method was only meant to be called once per query. !Screen Shot 2019-03-21 at 2.33.01 PM.png! Since the prefiltering that we do right now is rather simple and we already extract the tables used by queries and materializations within the rule, we can just skip a materialization at that point in the rule. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
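Editorial sketch of the idea in CALCITE-2943, under stated assumptions: once the tables read by the query and by each materialization are already known, a cheap set check can skip a materialization without a separate filtering call. The names ViewCandidate and MaterializationPrefilter are hypothetical and are not part of Calcite; the "shares at least one table" criterion is an assumption chosen to keep the check conservative, not a statement of what the rule actually implements.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical stand-in for a materialized view: a name plus the tables its definition reads. */
final class ViewCandidate {
  final String name;
  final Set<String> tables;

  ViewCandidate(String name, String... tables) {
    this.name = name;
    this.tables = new HashSet<>(Arrays.asList(tables));
  }
}

/** Hypothetical helper, not a Calcite class. */
final class MaterializationPrefilter {
  private MaterializationPrefilter() {}

  /** Keeps only the views that read at least one table also read by the query. */
  static List<ViewCandidate> prefilter(Set<String> queryTables, List<ViewCandidate> views) {
    List<ViewCandidate> kept = new ArrayList<>();
    for (ViewCandidate view : views) {
      // A view with no table in common with the query cannot contribute to a rewriting.
      if (!Collections.disjoint(queryTables, view.tables)) {
        kept.add(view);
      }
    }
    return kept;
  }

  public static void main(String[] args) {
    Set<String> queryTables = new HashSet<>(Arrays.asList("emps", "depts"));
    List<ViewCandidate> views = Arrays.asList(
        new ViewCandidate("mv_emps", "emps"),
        new ViewCandidate("mv_sales", "sales", "products"));
    for (ViewCandidate view : prefilter(queryTables, views)) {
      System.out.println(view.name);
    }
  }
}
```

The check costs one set comparison per view, so it can run on every rule invocation; a stricter criterion would discard more views but risks rejecting ones that a real rewriting could use.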
[jira] [Created] (CALCITE-2942) Materialized view rewriting logic instantiates RelMetadataQuery each time the rule is triggered
Jesus Camacho Rodriguez created CALCITE-2942: Summary: Materialized view rewriting logic instantiates RelMetadataQuery each time the rule is triggered Key: CALCITE-2942 URL: https://issues.apache.org/jira/browse/CALCITE-2942 Project: Calcite Issue Type: Improvement Reporter: Jesus Camacho Rodriguez Performance penalty is similar to the one described in CALCITE-1812. An instance may be available in the cluster, hence we can use it; this is just an addendum to CALCITE-1812. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (CALCITE-2883) HepPlanner subprogram may loop till getting out of memory
Jesus Camacho Rodriguez created CALCITE-2883: Summary: HepPlanner subprogram may loop till getting out of memory Key: CALCITE-2883 URL: https://issues.apache.org/jira/browse/CALCITE-2883 Project: Calcite Issue Type: Bug Reporter: Jesus Camacho Rodriguez Assignee: Jesus Camacho Rodriguez Consider the following two hep programs. Program 1: {code} final HepProgramBuilder programBuilder = new HepProgramBuilder(); programBuilder.addMatchOrder(HepMatchOrder.BOTTOM_UP); programBuilder.addRuleInstance(JoinToMultiJoinRule.INSTANCE); programBuilder.addRuleInstance(LoptOptimizeJoinRule.INSTANCE); final HepProgram program = programBuilder.build(); {code} Program 2: {code} final HepProgramBuilder programBuilder = new HepProgramBuilder(); final HepProgramBuilder subprogramBuilder = new HepProgramBuilder(); subprogramBuilder.addMatchOrder(HepMatchOrder.BOTTOM_UP); subprogramBuilder.addRuleInstance(JoinToMultiJoinRule.INSTANCE); subprogramBuilder.addRuleInstance(LoptOptimizeJoinRule.INSTANCE); programBuilder.addSubprogram(subprogramBuilder.build()); final HepProgram program = programBuilder.build(); {code} I would expect both programs to behave similarly. However, program 2 will loop indefinitely. The reason is that {{HepPlanner}} subprogram execution loops if subprogram generates any new expression. https://github.com/apache/calcite/blob/master/core/src/main/java/org/apache/calcite/plan/hep/HepPlanner.java#L339 This does not seem right since planner can control exiting the program (and thus, subprogram) depending on its own internal state and configuration properties, e.g., match limit. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (CALCITE-2858) JSON writer and reader should include type information for literals
Jesus Camacho Rodriguez created CALCITE-2858: Summary: JSON writer and reader should include type information for literals Key: CALCITE-2858 URL: https://issues.apache.org/jira/browse/CALCITE-2858 Project: Calcite Issue Type: Bug Components: core Reporter: Jesus Camacho Rodriguez Assignee: Jesus Camacho Rodriguez Fix For: 1.19.0 The JSON writer does not include information about the type of a literal (unless the literal is null). This can lead to ambiguity when the JSON reader parses the plan back into RelNodes. This issue is to include this information in the writer, and to be able to parse it back in the reader. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[ANNOUNCE] New committer: Zoltan Haindrich
Apache Calcite's Project Management Committee (PMC) has invited Zoltan Haindrich to become a committer, and we are pleased to announce that he has accepted. Over the past few months, Zoltan has contributed many improvements and fixes to core parts of the project related to query optimization. Zoltan, welcome, thank you for your contributions, and we look forward to your further interactions with the community! If you wish, please feel free to tell us more about yourself and what you are working on. Jesús (on behalf of the Apache Calcite PMC)
[ANNOUNCE] New committer: Stamatis Zampetakis
Apache Calcite's Project Management Committee (PMC) has invited Stamatis Zampetakis to become a committer, and we are pleased to announce that he has accepted. Over the past few months, Stamatis has made several contributions to Calcite and he is a very active participant in discussions in issues and mailing lists. Stamatis, welcome, thank you for your contributions, and we look forward to your further interactions with the community! If you wish, please feel free to tell us more about yourself and what you are working on. Jesús (on behalf of the Apache Calcite PMC)
Re: [VOTE] Release apache-calcite-1.18.0 (release candidate 2)
+1 (binding) - Checked signature and hashes - Built and ran test suite On 12/19/18, 11:52 PM, "Andrei Sereda" wrote: +1 (non-binding) - Build and ran tests from 6bca0b808 on MacOS X Oracle JDK 8, 9 and 10 - Checked signatures and checksums - Release notes look good On Wed, Dec 19, 2018 at 5:32 PM Francis Chuang wrote: > +1 (binding) > > Env: Debian 9 (stretch), Maven 3.6.0, OpenJDK 1.8.0_181 running in > docker container > > - Verified SHA256 - OK > - Verified signature - OK > - Checked NOTICE - OK > - Checked README - OK > - Ran mvn -DskipTests clean install - OK > - Ran mvn test - OK > > On 20/12/2018 5:42 am, Sergey Nuyanzin wrote: > > +1 (non-binding) > > > > checked hashes, signatures > > built and ran tests with jdk 8 on Windows 10 and Fedora 29 > > > > On Wed, Dec 19, 2018 at 7:33 PM Volodymyr Vysotskyi < > volody...@apache.org> > > wrote: > > > >> +1 (binding) > >> > >> Checked hash and signature for sources tar, unpacked and run tests, all > >> tests passed. > >> Verified signature for jars. Checked LICENSE and NOTICE. > >> > >> Thanks Julian for making the release! > >> > >> Kind regards, > >> Volodymyr Vysotskyi > >> > >> > >> On Wed, Dec 19, 2018 at 4:55 PM Kevin Risden > wrote: > >> > >>> +1 (binding) > >>> > >>> Checked rat, NOTICE, LICENSE, checksums and signatures. Built and ran > >> tests > >>> with JDK 8 and 11. 
> >>> > >>> Kevin Risden > >>> > >>> > >>> On Wed, Dec 19, 2018 at 8:24 AM Enrico Olivelli > >>> wrote: > >>> > +1 (non binding) > built from source artifacts, run tests, all passed on Linux fedora 28 > run tests of HerdDB, all is ok > checked signatures and checksums, all is ok > > great work > Enrico > > Il giorno mer 19 dic 2018 alle ore 13:32 Stamatis Zampetakis > ha scritto: > > +1 (non binding) > > > > System: Ubuntu 18.04 LTS and jdk1.8.0.66 > > > > -run mvn clean install on staged sources and git repo > > -checked signatures and checksums > > -went quickly over release note > > -run tests on downstream project (no regressions detected) > > > > Best, > > Stamatis > > > > Στις Τρί, 18 Δεκ 2018 στις 9:32 μ.μ., ο/η Michael Mior < > >>> mm...@apache.org > > έγραψε: > > > >> +1 (binding) Checked signature and hashes and compiled and ran > >> tests. > >> Thanks Julian! > >> > >> -- > >> Michael Mior > >> mm...@apache.org > >> > >> > >> Le mar. 18 déc. 2018 à 15:00, Julian Hyde a > >>> écrit : > >>> Hi all, > >>> > >>> I have created a build for Apache Calcite 1.18.0, release > >> candidate > 2. > >>> Since the previous release candidate (RC1), issues > >>> https://issues.apache.org/jira/browse/CALCITE-2730 and > >>> https://issues.apache.org/jira/browse/CALCITE-2731 have > >>> been fixed, and Zoltan indicates that the Hive test suite now > >>> passes. > >>> Thanks to everyone who has contributed to this release. > >>> You can read the release notes here: > >>> > >> > https://github.com/apache/calcite/blob/branch-1.18/site/_docs/history.md > >>> The commit to be voted upon: > >>> > >>> > >> > http://git-wip-us.apache.org/repos/asf/calcite/commit/6bca0b80859e86ac41ba1e342460db929cf72268 > >>> Its hash is 6bca0b80859e86ac41ba1e342460db929cf72268. 
> >>> > >>> The artifacts to be voted on are located here: > >>> > >> > https://dist.apache.org/repos/dist/dev/calcite/apache-calcite-1.18.0-rc2 > >>> The hashes of the artifacts are as follows: > >>> src.tar.gz.sha256 > >>> a04bfb1bac830e57475dffdbf5f0c3a797c358460d240f6203c929bac61b47ae > >>> > >>> A staged Maven repository is available for review at: > >>> > >> > https://repository.apache.org/content/repositories/orgapachecalcite-1051 > >>> Release artifacts are signed with the following key: > >>> https://people.apache.org/keys/committer/jhyde.asc > >>> > >>> Please vote on releasing this package as Apache Calcite 1.18.0. > >>> > >>> The vote is open for the next 72 hours (until 12 noon Pacific on > Friday) > >>> and passes if a majority of at least three +1 PMC votes are cast. > >>> > >>> [ ] +1 Release this package as Apache Calcite 1.18.0 > >>> [ ] 0 I don't feel strongly about it, but I'm okay with the > >>> release > >>> [ ] -1 Do not release this package because...
[jira] [Created] (CALCITE-2740) Add tests for SqlFunction.supportsFunction method
Jesus Camacho Rodriguez created CALCITE-2740: Summary: Add tests for SqlFunction.supportsFunction method Key: CALCITE-2740 URL: https://issues.apache.org/jira/browse/CALCITE-2740 Project: Calcite Issue Type: Test Components: jdbc-adapter Reporter: Jesus Camacho Rodriguez Assignee: Jesus Camacho Rodriguez Method is used in Hive, but it is not used by Calcite at all. At the very least, we should add some unit tests to check that the method is working as expected. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (CALCITE-2733) Use catalog and schema from JDBC connection string to retrieve tables if specified
Jesus Camacho Rodriguez created CALCITE-2733: Summary: Use catalog and schema from JDBC connection string to retrieve tables if specified Key: CALCITE-2733 URL: https://issues.apache.org/jira/browse/CALCITE-2733 Project: Calcite Issue Type: Improvement Components: jdbc-adapter Reporter: Jesus Camacho Rodriguez Assignee: Jesus Camacho Rodriguez From JDBC 4.1, catalog and schema can be retrieved from the connection object. When we retrieve the table objects using the JDBC connection, I believe we could try to get catalog and schema from the connection object if they have not been specified by the user. If they are not in the connection object either, null will be passed. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
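Editorial sketch of the fallback described in CALCITE-2733, under stated assumptions: prefer the user-specified value, otherwise ask the connection, otherwise pass null. Connection.getCatalog() and Connection.getSchema() are standard java.sql methods (getSchema was added in JDBC 4.1); the class JdbcNamespaceDefaults and its method names are hypothetical and do not reflect how the Calcite JDBC adapter is actually structured.

```java
import java.sql.Connection;
import java.sql.SQLException;
import java.sql.SQLFeatureNotSupportedException;

/** Hypothetical helper, not a Calcite class. */
final class JdbcNamespaceDefaults {
  private JdbcNamespaceDefaults() {}

  /** Returns the user-specified catalog if present, else the connection's catalog (may be null). */
  static String resolveCatalog(String userCatalog, Connection connection) throws SQLException {
    if (userCatalog != null) {
      return userCatalog;
    }
    return connection.getCatalog();
  }

  /** Returns the user-specified schema if present, else the connection's schema (may be null). */
  static String resolveSchema(String userSchema, Connection connection) throws SQLException {
    if (userSchema != null) {
      return userSchema;
    }
    try {
      return connection.getSchema(); // available from JDBC 4.1
    } catch (AbstractMethodError | SQLFeatureNotSupportedException e) {
      // Drivers built against an older JDBC version may not implement getSchema().
      return null;
    }
  }
}
```

The connection is only consulted when the user left the value unset, so existing configurations that specify catalog and schema explicitly would behave as before.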
[jira] [Created] (CALCITE-2732) Upgrade postgresql driver version
Jesus Camacho Rodriguez created CALCITE-2732: Summary: Upgrade postgresql driver version Key: CALCITE-2732 URL: https://issues.apache.org/jira/browse/CALCITE-2732 Project: Calcite Issue Type: Test Components: jdbc-adapter Reporter: Jesus Camacho Rodriguez Assignee: Jesus Camacho Rodriguez We are still using the {{9.3-1102-jdbc3}} version. Not sure if anyone has run the compatibility tests in calcite-test-dataset with PostgreSQL recently, but I get a java.lang.AbstractMethodError message for several of them. We can move to the more recent {{9.3-1104-jdbc41}} (I verified that this fixes the issue). -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (CALCITE-2713) JDBC adapter may generate casts on PostgreSQL for VARCHAR type exceeding max length
Jesus Camacho Rodriguez created CALCITE-2713: Summary: JDBC adapter may generate casts on PostgreSQL for VARCHAR type exceeding max length Key: CALCITE-2713 URL: https://issues.apache.org/jira/browse/CALCITE-2713 Project: Calcite Issue Type: Bug Components: jdbc-adapter Reporter: Jesus Camacho Rodriguez Assignee: Jesus Camacho Rodriguez Varchar length in PostgreSQL cannot exceed 10485760; however, Calcite may generate a cast with a length larger than that number, resulting in an exception. {noformat} org.postgresql.util.PSQLException: ERROR: length for type varchar cannot exceed 10485760 {noformat} From {{htup_details.h}} in PostgreSQL: {noformat} * MaxAttrSize is a somewhat arbitrary upper limit on the declared size of * data fields of char(n) and similar types. It need not have anything * directly to do with the *actual* upper limit of varlena values, which * is currently 1Gb (see TOAST structures in postgres.h). I've set it * at 10Mb which seems like a reasonable number --- tgl 8/6/00. */ {noformat} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
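Editorial sketch of one possible mitigation for CALCITE-2713, under stated assumptions: clamp the declared VARCHAR length to the PostgreSQL limit quoted above before emitting the cast. The class PostgresVarcharLimit and the method castType are hypothetical; this is not presented as the fix that was adopted in Calcite, only as an illustration of the limit.

```java
/** Hypothetical helper, not a Calcite class. */
final class PostgresVarcharLimit {
  /** Maximum declared varchar length accepted by PostgreSQL, per the error message in the issue. */
  static final int MAX_VARCHAR_LENGTH = 10485760;

  private PostgresVarcharLimit() {}

  /** Returns the type text to use in a CAST; a negative precision means "unspecified". */
  static String castType(int precision) {
    if (precision < 0) {
      return "VARCHAR";
    }
    return "VARCHAR(" + Math.min(precision, MAX_VARCHAR_LENGTH) + ")";
  }

  public static void main(String[] args) {
    System.out.println(castType(20));
    System.out.println(castType(Integer.MAX_VALUE));
  }
}
```

Clamping keeps the generated SQL valid; whether truncating the declared length is acceptable depends on the values being cast, so emitting an unbounded VARCHAR is an alternative worth weighing.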
[jira] [Created] (CALCITE-2673) SqlDialect supports pushing of all functions by default
Jesus Camacho Rodriguez created CALCITE-2673: Summary: SqlDialect supports pushing of all functions by default Key: CALCITE-2673 URL: https://issues.apache.org/jira/browse/CALCITE-2673 Project: Calcite Issue Type: Improvement Components: core Reporter: Jesus Camacho Rodriguez Assignee: Jesus Camacho Rodriguez Fix For: 1.18.0 SqlDialect contains a 'supportsFunction' method that can be used by rules to know whether a certain function is supported in the given dialect, e.g., to choose whether to push a Filter expression to JDBC, etc. The default implementation of 'supportsFunction' always returns true. I believe it would be better for the default implementation to support only the most common SQL functions. Then each dialect can override that behavior and expand/limit the supported functions, e.g., JethroDataDialect already does that. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
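Editorial sketch of the proposal in CALCITE-2673, under stated assumptions: a base dialect accepts a small set of common functions by default, and subclasses narrow or widen that set. The classes SketchDialect, NarrowDialect and WideDialect are hypothetical, the function list is an arbitrary illustration, and the plain-string signature is a simplification that does not match the real SqlDialect method.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

/** Hypothetical base dialect: supports only a fixed set of common functions by default. */
class SketchDialect {
  private static final Set<String> COMMON_FUNCTIONS = Collections.unmodifiableSet(
      new HashSet<>(Arrays.asList("ABS", "CAST", "COALESCE", "LOWER", "UPPER", "SUBSTRING")));

  boolean supportsFunction(String functionName) {
    return COMMON_FUNCTIONS.contains(functionName.toUpperCase(Locale.ROOT));
  }
}

/** Hypothetical dialect that limits the default set. */
class NarrowDialect extends SketchDialect {
  @Override boolean supportsFunction(String functionName) {
    return !"SUBSTRING".equalsIgnoreCase(functionName) && super.supportsFunction(functionName);
  }
}

/** Hypothetical dialect that expands the default set. */
class WideDialect extends SketchDialect {
  @Override boolean supportsFunction(String functionName) {
    return "REGEXP_LIKE".equalsIgnoreCase(functionName) || super.supportsFunction(functionName);
  }
}
```

With a conservative default, a rule that asks the dialect before pushing a Filter would keep unknown functions on the Calcite side unless a dialect explicitly opts in.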
[jira] [Created] (CALCITE-2669) RelMdTableReferences should check whether references inferred from input are null for Union/Join operators
Jesus Camacho Rodriguez created CALCITE-2669: Summary: RelMdTableReferences should check whether references inferred from input are null for Union/Join operators Key: CALCITE-2669 URL: https://issues.apache.org/jira/browse/CALCITE-2669 Project: Calcite Issue Type: Bug Components: core Reporter: Jesus Camacho Rodriguez Assignee: Jesus Camacho Rodriguez Fix For: 1.18.0 -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (CALCITE-2668) Support for left/right outer join in RelMdExpressionLineage
Jesus Camacho Rodriguez created CALCITE-2668: Summary: Support for left/right outer join in RelMdExpressionLineage Key: CALCITE-2668 URL: https://issues.apache.org/jira/browse/CALCITE-2668 Project: Calcite Issue Type: Improvement Components: core Reporter: Jesus Camacho Rodriguez Assignee: Jesus Camacho Rodriguez Currently, we bail out in the metadata provider if the join operator is not of inner type. For left and right outer joins, we could track expressions generated from columns of the left and right inputs, respectively. In addition, if any of the columns in the join cannot be backtracked, we could still backtrack the origin of an expression that does not use that column; currently we bail out in that case too. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[DISCUSS] Where do we draw the line?
Is it OK for a PMC member of this community to engage with a new contributor to the project in this way? https://github.com/apache/calcite/commit/b470a0cd4572c9f6c4c0e9b51926b97c5af58d3f#r30950660 I wanted to bring everyone's attention to the issue because I do not believe this behavior contributes to the health of the project, welcoming new contributions, etc. The same could have been said in a very different way, and I do not think Zoltan was engaging disrespectfully. I am not sure whether I am overreacting; I would like to hear others' opinions. Does anyone else in the PMC find this disturbing? Does the ASF provide clear guidelines about how members of a community should engage with each other? Thanks, Jesús
Re: Exception-handling in built-in functions
I do not believe there is enough reason to block CALCITE-525. IMO, CALCITE-525 describes a problem that some Calcite users are facing and a reasonable pluggable solution. We should not be vetoing such a feature without providing viable alternatives. (Without having checked the specific implementation details, I prefer approach B described below as it is less intrusive. And A should be fixed in a different issue.) I agree with Julian's idea that Calcite is not an RDBMS such as Oracle or Postgres, and it has always tried to provide flexibility to underlying engines, one of the reasons for its wide adoption. In addition, systems are not forced to use this feature: it is tagged as experimental and by default we are still running in the same mode. I believe that is sufficient. Personally, I will not be happy if a developer feels compelled to fork Calcite or stop contributing code because we do not accept features such as the one described there. Thanks, Jesús On 10/17/18, 5:17 PM, "Michael Mior" wrote: My apologies for missing this thread a couple days ago. (Thanks for pinging it.) Here's my two cents: taking care of contributors to the project is just as important (if not more important) than taking care of the code. I'm not saying we should merge terrible code just to keep each other happy, but I don't think that's the case here. If anyone writes some code which you disagree with, you should be free to voice your disagreement. However, especially when the code is from a core contributor and the argument focuses on potential future problems, I think it's important to consider that people who have shown dedication to the project over the years are very likely to be around and willing to fix these problems as they arise. Code which turns out to cause problems can always be deleted, reverted, refactored, etc. It's much harder to back out when a contributor is burned out or interpersonal conflicts get heated. -- Michael Mior mm...@apache.org Le mer. 17 oct. 
2018 à 14:58, Julian Hyde a écrit : > Vladimir, > > You’ve made your points. And I hear them. > > However I get the impression that you are not open to persuasion. Which > means that I am wasting my time trying to reach consensus with you. Which > means that people win arguments not on merit, but based upon who is most > persistent. > > Here is my point. Calcite's goal is not to re-create what Oracle or > PostgreSQL did ten years later. It is a platform that allows people to > write their own data engine. If they want to redefine the “+” operator such > that 2 + 2 returns 5, the platform should allow it. > > Certainly if they want to engineer their own error-handling strategy, we > should let them do it. I didn’t have the energy to find an example of a SQL > engine that discards rows with divide-by-zero errors, but I believe there > is one. I suspect that both Broadbase, SQLstream and Hive, three SQL > engines that I have worked on that performed ETL-like tasks, all had that > capability. And all ETL tools have very flexible error-handling strategies. > They are not SQL-based, but Calcite is not exclusively for SQL systems. > > I have been designing and building world-class data engines for 30 years. > Please take me on good faith that a flexible error-handing strategy is a > good idea. Don’t force me to bicker over email for hours and hours. When a > long discussion leads to the rejection of a contribution, I get > considerably closer to burning out. > > Julian > > > > On Oct 17, 2018, at 11:36 AM, Vladimir Sitnikov < > sitnikov.vladi...@gmail.com> wrote: > > > > Juilian>Hey, folks. We need your input here. > > > > Here are my thoughts: > > 1) I think the features we add should have at least some level of > > consistency > > 2) It is much safer to adopt well-known features rather than be pioneers > in > > the field. 
I do not mean we must wait for someone else implement and try > > out a feature, however I would not rush for implementing a feature that > > no-one else explored. > > > > CALCITE-525 has two key points: > > A) Current implementation of enumerable factors code like 0/0 to a static > > field of a generated code. It causes the generated code to fail at load > > time even before the query is executed. > > Of course that is a bug, and I'm even inclined to remove that "static > > fields" > > > > B) Someone (Hongze? Juilan?) suggest to implement a mode to silently > ignore > > the error (e.g. by ignoring the row or by returning default value). > > First of all, I don't think "ignore the row" kind of processing would do > > any good to the user since it would not be possible to predict the > output. > > "ignore the
Re: Query rewriting materialized view rules
Hi Mark, In principle both rewriting systems have the same objective. They can be enabled together (as in Calcite) or used independently (for instance, Hive uses AbstractMaterializedViewRule). Both rewriting mechanisms support SPJA materialized views, though MaterializedViewSubstitutionVisitor supports definitions with the Union operator too, while AbstractMaterializedViewRule does not. As mentioned in the documentation, one important difference is that the MaterializedViewSubstitutionVisitor relies on other transformation rules to create equivalences between the expressions in the original plan and the materialized view definition. Adding these additional rules to the planning phase makes the rewriting powerful but expensive, as it may not scale for some queries, e.g., query and materialized view with a large number of joins combined with rules that find all the possible join permutations in a plan. In turn, the rules in AbstractMaterializedViewRule rely on structural information extracted from the subplan after a match is found, hence they do not need to enumerate exhaustively all equivalent expressions in a plan to produce a rewriting. AbstractMaterializedViewRule can also produce partial rewritings if the result of a query is partially contained in the MV (including for MVs with an Aggregate operator by having an additional rollup operation), which I believe MaterializedViewSubstitutionVisitor cannot do right now. (There are some examples of partial rewritings in the documentation). Depending on your specific use case, you may also be interested in lattices: http://calcite.apache.org/docs/lattice.html. -Jesús On 10/15/18, 5:06 AM, "mark pasterkamp" wrote: Hello, I have some confusion regarding the query rewriting rules in Calcite and I was hoping someone could help me with that. 
Looking at the documentation of materialized views http://calcite.apache.org/docs/materialized_views#materialized-views-maintained-by-calcite and in the source code, I found there are 2 systems in place for rewriting queries to use materialized views. There are the unify rules found in https://github.com/apache/calcite/blob/master/core/src/main/java/org/apache/calcite/plan/SubstitutionVisitor.java with an extension to materialized views found in https://github.com/apache/calcite/blob/master/core/src/main/java/org/apache/calcite/plan/MaterializedViewSubstitutionVisitor.java And there are the materialized view rules found in https://github.com/apache/calcite/blob/master/core/src/main/java/org/apache/calcite/rel/rules/AbstractMaterializedViewRule.java Since I am not that experienced working with Calcite, I was wondering if someone could shed some light on these 2 query rewriting systems. Are they supposed to be used independently or in conjunction? Do they try to solve the same thing or is one rewriting system better for some queries than others? (for instance, if I want a query to be rewritten to use materialized views, would that better fit the MaterializedViewSubstitutionVisitor or the AbstractMaterializedViewRule). If someone could help me out to better understand how Calcite does its rewritings, that would be great.
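The "partial rewriting with an additional rollup" mentioned in the reply above can be illustrated with a small, self-contained sketch. It does not use Calcite at all: the class name, the Row type and the data are hypothetical, and it assumes a SUM aggregate, which can be re-aggregated by adding partial sums. The materialized view is assumed to hold per-department sums for the rows it covers, and the rows it does not cover are read from the source table.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class PartialRewriteSketch {
  /** Hypothetical source-table row: (dept, amount). */
  static final class Row {
    final String dept;
    final long amount;

    Row(String dept, long amount) {
      this.dept = dept;
      this.amount = amount;
    }
  }

  /**
   * Combines pre-aggregated sums from the materialized view with the rows
   * the view does not cover, then rolls both up into one SUM per dept.
   */
  static Map<String, Long> sumByDept(Map<String, Long> mvSums, List<Row> uncoveredRows) {
    Map<String, Long> result = new TreeMap<>(mvSums);
    for (Row r : uncoveredRows) {
      // Rollup step: partial sums for the same key are simply added.
      result.merge(r.dept, r.amount, Long::sum);
    }
    return result;
  }

  public static void main(String[] args) {
    Map<String, Long> mvSums = new TreeMap<>();
    mvSums.put("a", 10L);
    mvSums.put("b", 7L);
    List<Row> uncovered = Arrays.asList(new Row("a", 5), new Row("c", 2));
    System.out.println(sumByDept(mvSums, uncovered));
  }
}
```

The same idea would not work unchanged for aggregates that cannot be re-aggregated from partial results, which is why this sketch is limited to SUM.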
[jira] [Created] (CALCITE-2622) RexFieldCollation toString method is not deterministic
Jesus Camacho Rodriguez created CALCITE-2622: Summary: RexFieldCollation toString method is not deterministic Key: CALCITE-2622 URL: https://issues.apache.org/jira/browse/CALCITE-2622 Project: Calcite Issue Type: Bug Components: core Reporter: Jesus Camacho Rodriguez Assignee: Jesus Camacho Rodriguez It iterates over a set to print its flags. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
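A minimal sketch of the kind of problem described in this issue, and one possible way to avoid it. The enum and method names are hypothetical and are not Calcite's actual code. It relies on the fact that the iteration order of a HashSet depends on hash codes, and that enum hash codes are identity-based, so the order can differ between JVM runs; copying the flags into a sorted set before printing makes the output stable.

```java
import java.util.HashSet;
import java.util.Set;
import java.util.TreeSet;

public class DeterministicFlags {
  /** Hypothetical flag names; not Calcite's actual enum. */
  enum Flag { DESCENDING, NULLS_FIRST, NULLS_LAST }

  /** Prints the flags in declaration order, whatever set implementation is passed in. */
  static String describe(String expr, Set<Flag> flags) {
    StringBuilder sb = new StringBuilder(expr);
    // TreeSet sorts enum constants by their natural order (declaration order).
    for (Flag f : new TreeSet<>(flags)) {
      sb.append(' ').append(f.name());
    }
    return sb.toString();
  }

  public static void main(String[] args) {
    Set<Flag> flags = new HashSet<>();
    flags.add(Flag.NULLS_LAST);
    flags.add(Flag.DESCENDING);
    System.out.println(describe("$0", flags));
  }
}
```

An EnumSet would give the same ordering guarantee without the copy, if the field type can be changed.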
Re: Correctness of SortJoinTransposeRule
The idea for that rule was to be able to exploit the limit/fetch spec of the Sort operator to reduce the number of rows that needed to be joined, that is why it was only applied to LEFT/RIGHT outer join. I think option 2 below sounds better than creating a new rule variant. Thanks, Jesús On 9/6/18, 10:28 AM, "Julian Hyde" wrote: Ah, that makes sense. Reading the code, I couldn’t figure out why it applies to LEFT and RIGHT but not to INNER. (For some kinds of join, for example inner merge join, it could push the sort to both sides, as long as the sort was compatible with what is needed to ensure that the keys arrive at the right time.) If needed, we could have a variant of the rule that omits the Sort after the Join. Or perhaps we leave the Sort and have a rule that notices the output order of the Join and, based on that, weakens[1] or removes the Sort. Julian [1] https://issues.apache.org/jira/browse/CALCITE-2540 <https://issues.apache.org/jira/browse/CALCITE-2540> > On Sep 6, 2018, at 10:08 AM, Jesus Camacho Rodriguez wrote: > > If I remember correctly, the rule pushes the Sort through the Join (if possible), but it also preserves the Sort on top of the Join to ensure correctness. > > -Jesús > > > On 9/6/18, 9:57 AM, "Julian Hyde" wrote: > >Yes, it depends very much on the operator. Some examples: >Merge join typically requires inputs to be sorted, and preserves that order. (But some outer joins may throw in null values out of order.) >Map join typically preserves the order of the probing side, not the build side. >Hash join typically destroys the order of both sides. >Use the rule with caution. > >Julian > > >> On Sep 6, 2018, at 9:33 AM, Stamatis Zampetakis wrote: >> >> Hello, >> >> I noticed that there is a Calcite rule (i.e., SortJoinTransposeRule) that >> pushes a LogicalSort past a LogicalJoin if the join is either left outer or >> right outer. >> >> Who guarantees that the left and right outer joins are preserving the order >> of the inputs? 
>> Does the SQL standard require that these types of joins are order >> preserving? >> >> Since we are working with logical operators, I would tend to think that we >> cannot assume anything about the physical equivalent. >> >> Best, >> Stamatis > > >
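The reasoning in this thread (push the Sort with its fetch below the outer join to shrink the join input, but keep the Sort on top for correctness) can be sketched without Calcite. Everything below is hypothetical illustration code: rows are plain strings, the sort key is the left key only, and left keys are assumed to be unique. The pushdown is safe for a left outer join because every left row produces at least one output row, so the first n left rows are enough to produce the first n output rows; an inner join could drop some of them.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class SortPushDownSketch {
  // Rows are encoded as "leftKey:rightValue"; the sort looks only at the left key.
  static final Comparator<String> BY_LEFT_KEY =
      Comparator.comparingInt((String r) -> Integer.parseInt(r.substring(0, r.indexOf(':'))));

  /** Left outer join: every left key produces at least one output row. */
  static List<String> leftJoin(List<Integer> left, Map<Integer, List<String>> right) {
    List<String> out = new ArrayList<>();
    for (Integer k : left) {
      List<String> matches = right.getOrDefault(k, Collections.singletonList("null"));
      for (String m : matches) {
        out.add(k + ":" + m);
      }
    }
    return out;
  }

  /** Sort with fetch: stable sort on the left key, then keep the first n rows. */
  static List<String> sortLimit(List<String> rows, int n) {
    List<String> sorted = new ArrayList<>(rows);
    sorted.sort(BY_LEFT_KEY);
    return new ArrayList<>(sorted.subList(0, Math.min(n, sorted.size())));
  }

  /** Original plan: Sort(fetch=n) on top of the join. */
  static List<String> original(List<Integer> left, Map<Integer, List<String>> right, int n) {
    return sortLimit(leftJoin(left, right), n);
  }

  /** Transposed plan: Sort(fetch=n) pushed below the join and also kept on top. */
  static List<String> transposed(List<Integer> left, Map<Integer, List<String>> right, int n) {
    List<Integer> sortedLeft = new ArrayList<>(left);
    Collections.sort(sortedLeft);
    List<Integer> topLeft = sortedLeft.subList(0, Math.min(n, sortedLeft.size()));
    return sortLimit(leftJoin(topLeft, right), n);
  }

  public static void main(String[] args) {
    List<Integer> left = Arrays.asList(3, 1, 2);
    Map<Integer, List<String>> right = new HashMap<>();
    right.put(1, Arrays.asList("a", "b"));
    right.put(3, Arrays.asList("c"));
    System.out.println(original(left, right, 2));
    System.out.println(transposed(left, right, 2));
  }
}
```

Both plans return the same rows here, while the transposed plan joins only two left rows instead of three. The top Sort is what makes the result independent of whether the join implementation preserves order.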
Re: Correctness of SortJoinTransposeRule
If I remember correctly, the rule pushes the Sort through the Join (if possible), but it also preserves the Sort on top of the Join to ensure correctness. -Jesús On 9/6/18, 9:57 AM, "Julian Hyde" wrote: Yes, it depends very much on the operator. Some examples: Merge join typically requires inputs to be sorted, and preserves that order. (But some outer joins may throw in null values out of order.) Map join typically preserves the order of the probing side, not the build side. Hash join typically destroys the order of both sides. Use the rule with caution. Julian > On Sep 6, 2018, at 9:33 AM, Stamatis Zampetakis wrote: > > Hello, > > I noticed that there is a Calcite rule (i.e., SortJoinTransposeRule) that > pushes a LogicalSort past a LogicalJoin if the join is either left outer or > right outer. > > Who guarantees that the left and right outer joins are preserving the order > of the inputs? > Does the SQL standard require that these types of joins are order > preserving? > > Since we are working with logical operators, I would tend to think that we > cannot assume anything about the physical equivalent. > > Best, > Stamatis
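As a rough illustration of the remark that a map join preserves the order of the probing side but not of the build side, here is a small sketch. It is not Calcite code; the names are hypothetical and the join is a simplified inner join with at most one match per key. The build side is loaded into a hash table, so its insertion order is lost, and the output follows the order in which the probe side is streamed.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class MapJoinOrderSketch {
  /** Inner map join: look up each probe key in a hash table built from the other side. */
  static List<String> mapJoin(List<Integer> probeKeys, Map<Integer, String> buildSide) {
    List<String> out = new ArrayList<>();
    for (Integer k : probeKeys) {
      String v = buildSide.get(k);
      if (v != null) {
        out.add(k + "=" + v);
      }
    }
    return out;
  }

  public static void main(String[] args) {
    // The build side is inserted in an arbitrary order.
    Map<Integer, String> build = new HashMap<>();
    build.put(5, "e");
    build.put(1, "a");
    build.put(3, "c");
    // The probe side arrives sorted, and the output keeps that order.
    System.out.println(mapJoin(Arrays.asList(1, 2, 3, 5), build));
  }
}
```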
[jira] [Created] (CALCITE-2465) Enable use of materialized views for any planner
Jesus Camacho Rodriguez created CALCITE-2465: Summary: Enable use of materialized views for any planner Key: CALCITE-2465 URL: https://issues.apache.org/jira/browse/CALCITE-2465 Project: Calcite Issue Type: Improvement Affects Versions: 1.17.0 Reporter: Jesus Camacho Rodriguez Assignee: Jesus Camacho Rodriguez Currently it is only supported in VolcanoPlanner. Using HepPlanner can be useful for debugging purposes. We only need to create/override the relevant method. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
Re: SQL Query Set Analyzer
You can find an overview of the work that has been done in Hive for materialized view integration in the following link: https://cwiki.apache.org/confluence/display/Hive/Materialized+views Materialized views can be stored in external tables such as Druid-backed tables too. The Druid rules that live in Calcite are used to push computation from Hive to Druid. The rewriting algorithm itself is in Calcite. The algorithm can take advantage of constraints (PK-FK relationship between tables) to produce additional correct rewritings, can execute rollups, etc. However, it does not assume any specific schema layout, which may make it useful for multiple ETL workloads. http://calcite.apache.org/docs/materialized_views#rewriting-using-plan-structural-information The most recent addition is the support for partitioned materialized views, including the extension in the cost model to take into account partition pruning during the planning phase. Incremental maintenance is supported. Most of that code lives in Hive, but it relies on the rewriting algorithm too. It only works for materialized views that use Hive transactional tables, either full ACID or insert-only. Basically Hive explicitly exposes the data contained in the materialization via a filter condition, e.g., mv1 contains data for transactions (x, y, z), and then lets the rewriting algorithm trigger a partial rewriting which reads new contents from the source tables and processed contents from mv1. Finally, an additional step transforms the rewritten expression into an INSERT or MERGE statement depending on the materialized view expression (MERGE for materialized views containing aggregations). Since not all tables in Hive support the UPDATE needed for MERGE, we were thinking about allowing some target materialized views with definitions that include aggregates to use INSERT and then force the rollup at runtime, e.g., for Druid. bq. Maybe it depends on the aggregation functions that are used? 
The result of some aggregate functions cannot be (always) incrementally maintained in the presence of UPDATE/DELETE operations on source tables, e.g., min and max, though some rewriting to minimize full rebuilds can be used if count is added as an additional column to the materialized view. Incremental maintenance in presence of UPDATE/DELETE operations in source tables is not supported in Hive yet, hence this is not implemented. I would like to think that of the problems described below, we are getting to the 'more interesting stuff' in the Hive project, though there is some consolidation needed for existing work too. That is why we are also interested in any effort related to materialization recommendation. I believe the most powerful abstraction to use would be RelNode, which can be useful for any system representing its queries internally using that representation, instead of relying on SQL nodes which are more closely tied to the parser. Concerning the 'feedback loop', this recent paper by MSFT describes a system that does something similar to what James was describing (for SCOPE): https://www.microsoft.com/en-us/research/uploads/prod/2018/03/cloudviews-sigmod2018.pdf -Jesús On 8/6/18, 3:32 PM, "Julian Hyde" wrote: It’s hard to automatically recommend a set of MVs from past queries. The design space is just too large. But if you are designing MVs for interactive BI, you can use the “lattice” model. This works because many queries will be filter-join-aggregate queries on a star schema (i.e. a central fact table and dimension tables joined over many-to-one relationships). (Or perhaps a join between two or more such queries.) Do the queries you are trying to optimize have that pattern? If so, you might start by creating a lattice for each such star schema. Then the lattice can suggest MVs that are summary tables. (Lattice suggester is one step more meta - it recommends lattices - but given where you are, I would suggest hand-writing one or two lattices.) 
Calcite is a framework, and this unfortunately means that you have to write Java code to use these features. It might be easier if you use the new “server” module, which supports CREATE MATERIALIZED VIEW as a DDL statement. Then you can create some demos for your colleagues that are wholly or mostly SQL. The simplest way to populate a materialized view is the CREATE MATERIALIZED VIEW statement. It basically does the same as CREATE TABLE AS SELECT (executes a query, stores the results in a table) but it leaves behind the metadata about where that data came from. Materialized views can in principle be maintained incrementally, but how you do it depends upon what changes are allowed (append only? Replace rows and write the old rows to an audit table?). We’ve not done a lot of work on it. I believe the Hive folks have given this more thought than I have. Julian > On Aug 3, 2018, at 11:11 PM, James Taylor wrote:
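The point made earlier in this thread, that MIN and MAX cannot always be maintained incrementally under DELETE while a COUNT column can reduce full rebuilds, can be sketched as follows. This is only one possible interpretation and is not how Hive or Calcite implement it: the class and field names are hypothetical, it tracks a single group, and it uses the count only to detect that the group has become empty. SUM and COUNT are adjusted in place, while MIN is marked stale when the deleted value equals the current minimum.

```java
public class IncrementalAggSketch {
  long sum;
  long count;
  long min;          // only meaningful when count > 0 and minStale is false
  boolean minStale;  // true when MIN must be recomputed from the source table

  /** Applies an inserted source row to the aggregates of this group. */
  void insert(long v) {
    sum += v;
    count++;
    if (count == 1) {
      min = v;
      minStale = false;
    } else if (!minStale && v < min) {
      min = v;
    }
  }

  /** Applies a deleted source row; SUM and COUNT never need a rebuild. */
  void delete(long v) {
    sum -= v;
    count--;
    if (count == 0) {
      minStale = false;  // empty group: nothing left to recompute
    } else if (!minStale && v == min) {
      minStale = true;   // the minimum may have changed; a rebuild is needed
    }
  }

  public static void main(String[] args) {
    IncrementalAggSketch g = new IncrementalAggSketch();
    g.insert(5);
    g.insert(3);
    g.insert(8);
    g.delete(8);
    System.out.println("sum=" + g.sum + " min=" + g.min + " stale=" + g.minStale);
    g.delete(3);
    System.out.println("sum=" + g.sum + " stale=" + g.minStale);
  }
}
```

The sketch is conservative: deleting one of several rows that share the minimum value also marks MIN as stale, even though the minimum has not actually changed.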
[jira] [Created] (CALCITE-2387) Fix for date/timestamp cast expressions in Druid adapter
Jesus Camacho Rodriguez created CALCITE-2387: Summary: Fix for date/timestamp cast expressions in Druid adapter Key: CALCITE-2387 URL: https://issues.apache.org/jira/browse/CALCITE-2387 Project: Calcite Issue Type: Bug Components: druid Affects Versions: 1.17.0 Reporter: Jesus Camacho Rodriguez Assignee: Jesus Camacho Rodriguez Fix For: 1.17.0 Follow-up for CALCITE-2286. Timezone should be set correctly by Druid when there is a cast for timestamp type vs timestamp with local time zone. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (CALCITE-2334) Extend simplification of expressions with CEIL function over date types
Jesus Camacho Rodriguez created CALCITE-2334: Summary: Extend simplification of expressions with CEIL function over date types Key: CALCITE-2334 URL: https://issues.apache.org/jira/browse/CALCITE-2334 Project: Calcite Issue Type: Improvement Components: core Affects Versions: 1.17.0 Reporter: Jesus Camacho Rodriguez Assignee: Julian Hyde CALCITE-2332 disables simplification of the CEIL function due to correctness issues. {{FLOOR}} and {{CEIL}} cannot be handled with the same logic. For instance, {{CEIL(CEIL(x TO HOUR) TO YEAR)}} cannot be simplified (consider, e.g., x = '2011-12-31 23:59:59'), while {{FLOOR(FLOOR(x TO HOUR) TO YEAR)}} can be simplified. Hence, we need new logic to enable simplification of the CEIL function on date types. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (CALCITE-2300) Improve messages shown in materialized view-based rewriting algorithm
Jesus Camacho Rodriguez created CALCITE-2300: Summary: Improve messages shown in materialized view-based rewriting algorithm Key: CALCITE-2300 URL: https://issues.apache.org/jira/browse/CALCITE-2300 Project: Calcite Issue Type: Improvement Affects Versions: 1.16.0 Reporter: Jesus Camacho Rodriguez Assignee: Jesus Camacho Rodriguez Currently, we do not log or expose any information on why a specific rewriting based on a materialized view was not triggered. Probably we should do that at a very high level of granularity, for instance using the logger TRACE level. This will help 1) debugging possible missing rewriting options, and 2) possibly providing feedback to user if needed. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (CALCITE-2286) Support timestamp type for Druid adapter
Jesus Camacho Rodriguez created CALCITE-2286: Summary: Support timestamp type for Druid adapter Key: CALCITE-2286 URL: https://issues.apache.org/jira/browse/CALCITE-2286 Project: Calcite Issue Type: Improvement Components: druid Reporter: Jesus Camacho Rodriguez Assignee: Jesus Camacho Rodriguez Fix For: 1.17.0 CALCITE-1947 replaced the {{timestamp}} type with {{timestamp with local time zone}} in the Druid adapter. This issue aims at supporting both types for Druid-backed tables and letting the user decide the type/semantics they want to use. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
Re: Druid Adapter - Error Prone checking result
Thanks Kevin! On 4/24/18, 7:43 PM, "Kevin Risden" <kris...@apache.org> wrote: Created CALCITE-2279 Kevin Risden On Tue, Apr 24, 2018 at 1:33 PM, Kevin Risden <kris...@apache.org> wrote: > No JIRA yet. Will create one for this. > > Kevin Risden > > On Tue, Apr 24, 2018, 09:31 Jesus Camacho Rodriguez <jcama...@apache.org> > wrote: > >> Hi Kevin, >> >> Thanks for the feedback. Is there a JIRA for this case? It seems it is a >> bug indeed. >> >> -Jesús >> >> >> On 4/19/18, 8:06 PM, "Kevin Risden" <kris...@apache.org> wrote: >> >> I was looking into Error Prone [1] checking for Calcite and it found >> what >> looks like a bug in Druid Adapter. The output is as follows >> >> DruidJsonFilter.java:[324,9] [IdentityBinaryExpression] A binary >> expression >> > where both operands are the same is usually incorrect; the value of >> this >> > expression is equivalent to `lhs.getType().getFamily() == >> > SqlTypeFamily.NUMERIC`. >> > [ERROR] (see >> > http://errorprone.info/bugpattern/IdentityBinaryExpression) >> >> >> The DruidJsonFilter [2] has left and right hand the exact same. >> Raising >> awareness here before going to JIRA. >> >> [1] http://errorprone.info/ >> [2] >> https://github.com/apache/calcite/blob/master/druid/src/ >> main/java/org/apache/calcite/adapter/druid/DruidJsonFilter.java#L323 >> >> Kevin Risden >> >> >> >>
[jira] [Created] (CALCITE-2277) Skip SemiJoin operator in materialized view-based rewriting algorithm
Jesus Camacho Rodriguez created CALCITE-2277: Summary: Skip SemiJoin operator in materialized view-based rewriting algorithm Key: CALCITE-2277 URL: https://issues.apache.org/jira/browse/CALCITE-2277 Project: Calcite Issue Type: Bug Affects Versions: 1.16.0 Reporter: Jesus Camacho Rodriguez Assignee: Jesus Camacho Rodriguez Fix For: 1.17.0 {{AbstractMaterializedViewRule}} is not recognizing the difference between SemiJoin and Join operator, since SemiJoin inherits from Join. This can lead to incorrect rewriting and errors in the rewriting logic. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
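The inheritance pitfall described in this issue can be shown with a minimal sketch. The classes below are hypothetical stand-ins, not Calcite's real operator classes, and the exclusion check is only one possible way to tell the two operators apart: a plain instanceof test on the parent class also matches the subclass.

```java
public class SemiJoinCheckSketch {
  // Hypothetical stand-ins for the operator class hierarchy.
  static class RelNode {}
  static class Join extends RelNode {}
  static class SemiJoin extends Join {}

  /** Naive check: also accepts SemiJoin, because SemiJoin is a subclass of Join. */
  static boolean isJoinNaive(RelNode node) {
    return node instanceof Join;
  }

  /** Stricter check that explicitly excludes SemiJoin. */
  static boolean isPlainJoin(RelNode node) {
    return node instanceof Join && !(node instanceof SemiJoin);
  }

  public static void main(String[] args) {
    RelNode semi = new SemiJoin();
    System.out.println(isJoinNaive(semi));  // the naive check treats it as a Join
    System.out.println(isPlainJoin(semi));  // the stricter check skips it
  }
}
```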
Re: Druid Adapter - Error Prone checking result
Hi Kevin, Thanks for the feedback. Is there a JIRA for this case? It seems it is a bug indeed. -Jesús On 4/19/18, 8:06 PM, "Kevin Risden" wrote: I was looking into Error Prone [1] checking for Calcite and it found what looks like a bug in Druid Adapter. The output is as follows DruidJsonFilter.java:[324,9] [IdentityBinaryExpression] A binary expression > where both operands are the same is usually incorrect; the value of this > expression is equivalent to `lhs.getType().getFamily() == > SqlTypeFamily.NUMERIC`. > [ERROR] (see > http://errorprone.info/bugpattern/IdentityBinaryExpression) The DruidJsonFilter [2] has exactly the same left-hand and right-hand operands. Raising awareness here before going to JIRA. [1] http://errorprone.info/ [2] https://github.com/apache/calcite/blob/master/druid/src/main/java/org/apache/calcite/adapter/druid/DruidJsonFilter.java#L323 Kevin Risden
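A minimal sketch of the bug pattern that Error Prone reports here. It is not the actual DruidJsonFilter code: the enum, the method names and the choice of the && operator are assumptions made for illustration. The first method tests the same operand twice, so the right-hand side is never checked; the second is the presumably intended form.

```java
public class IdentityExpressionSketch {
  /** Hypothetical stand-in for a type family enum. */
  enum Family { NUMERIC, CHARACTER }

  /** Flagged pattern: both sides of the binary expression are the same. */
  static boolean bothNumericBuggy(Family lhs, Family rhs) {
    return lhs == Family.NUMERIC && lhs == Family.NUMERIC;
  }

  /** Presumably intended form: the second test looks at the right-hand side. */
  static boolean bothNumericFixed(Family lhs, Family rhs) {
    return lhs == Family.NUMERIC && rhs == Family.NUMERIC;
  }

  public static void main(String[] args) {
    // The buggy version accepts a character right-hand side.
    System.out.println(bothNumericBuggy(Family.NUMERIC, Family.CHARACTER));
    System.out.println(bothNumericFixed(Family.NUMERIC, Family.CHARACTER));
  }
}
```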
[jira] [Created] (CALCITE-2232) Assertion error on AggregatePullUpConstantsRule
Jesus Camacho Rodriguez created CALCITE-2232: Summary: Assertion error on AggregatePullUpConstantsRule Key: CALCITE-2232 URL: https://issues.apache.org/jira/browse/CALCITE-2232 Project: Calcite Issue Type: Bug Components: core Affects Versions: 1.16.0 Reporter: Jesus Camacho Rodriguez Assignee: Jesus Camacho Rodriguez Fix For: 1.17.0 Executing the following query: {code:sql} select ename, sal from (select '1', ename, sal from emp where ename = 'John') subq group by ename, sal; {code} results in the following error: {code} java.lang.AssertionError: Cannot add expression of different type to set: set type is RecordType(VARCHAR(20) CHARACTER SET "ISO-8859-1" COLLATE "ISO-8859-1$en_US$primary" NOT NULL ENAME, INTEGER NOT NULL SAL) NOT NULL expression type is RecordType(INTEGER NOT NULL ENAME, INTEGER NOT NULL SAL) NOT NULL set is rel#21:LogicalAggregate(input=HepRelVertex#20,group={1, 5}) expression is LogicalProject#24 {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
Re: [DISCUSS] Committer duties
IMO the most important task is to stay on top of the issues and PRs; I am not so concerned about the other project management tasks since it is easier to do them collaboratively. I think we do not need owners for components, as it will not help the project in any way. If an owner does not review some PRs, what are we going to do? Effectively, we cannot force him/her to do it in a timely manner, and at the same time, we do not want to hold commits to the project till the owner decides to review the PR. Committers are more familiar with each other's work so we (the committers) could try to proactively monitor the mailing list for new issues and help contributors by reviewing or pinging the right reviewer for each of them. Ultimately, it is our responsibility as a community to commit those patches and keep the project moving forward. This means that we may need to step up and review a certain patch as well as we can, even if we are less familiar with a certain module. -Jesús On 3/27/18, 2:33 PM, "F21" wrote: Hey everyone, I am happy to take ownership of the Go avatica client. I am currently quite busy, but I hope to test it against the latest version of avatica released a couple of weeks ago and see if we can make a release for it. Francis On 28/03/2018 6:27 AM, Shuyi Chen wrote: > Hi Julian and Michael, > > Thanks a lot for starting the discussion. I think the ownership model is a > good idea, and has been used by other open source communities, and we can > further break down core into e.g. sql parser, sql validator, relational > algebra, planner, JSON model, runtime, etc. Also, we need to add the > 'server' module into the JIRA component list for DDLs. And I think adding > component in the PR title will help owners to filter and identify issues > quickly, also I think we can use a template to enforce a more detailed PR > description, so the reviewer can better understand the context and review > the code. 
> > I have some knowledge in sql parser, JSON model, relational algebra and > planner, and am currently working on the server module to add the > type/library/function DDLs. I can definitely help on answering questions on > mailing list, reviewing code and contributing PRs for these components. > Also, I am definitely interested in learning and helping more on committing > code and doing releases as well. > > Cheers > Shuyi > > > On Tue, Mar 27, 2018 at 9:51 AM, Michael Mior wrote: > >> Thanks for starting the discussion Julian. I suggested at some point in the >> past that we figure out people who are willing to take ownership over >> certain components of Calcite. It seems like this would at least be a start >> to staying on top of PRs and issues. However, we would probably have to >> segment core practically for this to help. >> >> Another thing that comes to mind is staying on top of updates to >> dependencies. If people are owning certain components, hopefully they would >> also be willing to do a quick check around release time to see if new >> versions of dependencies for that component have been released and test and >> update if possible. >> >> Then there's also more administrative tasks such as making releases and >> ensuring a good flow of new committers and PMC members. Anything else I'm >> missing? >> >> Cheers, >> -- >> Michael Mior >> mm...@apache.org >> >> 2018-03-27 12:40 GMT-04:00 Julian Hyde : >> >>> I’m not working full-time on Calcite anymore. But this project still >> needs >>> regular — daily — work to stay on top of contributions. If there’s only >> one >>> person doing the work then one is likely to become zero. >>> >>> Let’s come up with a plan — with some commitments — for how this work >> will >>> get done. >>> >>> Julian >>> >>> > >
Re: [ANNOUNCE] Apache Calcite 1.16.0 released
Hi Sebb, On 3/19/18, 4:32 PM, "sebb" <seb...@gmail.com> wrote: On 19 March 2018 at 19:21, Jesus Camacho Rodriguez <jcama...@apache.org> wrote: > The Apache Calcite team is pleased to announce the release of Apache > Calcite 1.16.0. > > Calcite is a dynamic data management framework. Its cost-based > optimizer converts queries, represented in relational algebra, into > executable plans. Calcite supports many front-end languages and > back-end data engines, and includes an SQL parser and, as a > sub-project, the Avatica JDBC driver. > > This release comes three months after 1.15.0. It includes more than > 80 resolved issues, comprising a large number of new features as > well as general improvements and bug-fixes to Calcite core. > > You can start using it in Maven by simply updating your dependency to: > > > org.apache.calcite > calcite-core > 1.16.0 > > > If you'd like to download the source release, you can find it here: > > http://www.apache.org/dyn/closer.cgi/calcite/apache-calcite-1.16.0/ The website has a good download page here: http://calcite.apache.org/downloads/ This includes links to hashes and sigs and so is more useful than the above URL. It also does not need to be changed with each release. May I request that you use it in future announce emails please? Seems like a good idea, I will change the release documentation. Thanks, Jesús > You can read more about the release (including release notes) here: > > http://calcite.apache.org/news/2018/03/19/release-1.16.0/ > > We welcome your help and feedback. For more information on how to > report problems, and to get involved, visit the project website at: > > http://calcite.apache.org/ > > Thanks to everyone involved! > > Jesus Camacho Rodriguez, on behalf of the Apache Calcite Team > > >
[ANNOUNCE] Apache Calcite 1.16.0 released
The Apache Calcite team is pleased to announce the release of Apache Calcite 1.16.0. Calcite is a dynamic data management framework. Its cost-based optimizer converts queries, represented in relational algebra, into executable plans. Calcite supports many front-end languages and back-end data engines, and includes an SQL parser and, as a sub-project, the Avatica JDBC driver. This release comes three months after 1.15.0. It includes more than 80 resolved issues, comprising a large number of new features as well as general improvements and bug-fixes to Calcite core. You can start using it in Maven by simply updating your dependency to: <dependency> <groupId>org.apache.calcite</groupId> <artifactId>calcite-core</artifactId> <version>1.16.0</version> </dependency> If you'd like to download the source release, you can find it here: http://www.apache.org/dyn/closer.cgi/calcite/apache-calcite-1.16.0/ You can read more about the release (including release notes) here: http://calcite.apache.org/news/2018/03/19/release-1.16.0/ We welcome your help and feedback. For more information on how to report problems, and to get involved, visit the project website at: http://calcite.apache.org/ Thanks to everyone involved! Jesus Camacho Rodriguez, on behalf of the Apache Calcite Team
Re: [VOTE] Release apache-calcite-1.16.0 (release candidate 0)
Hi Julian, Thanks for pointing that out. Indeed, I had copied the last vote email. The md5 files are generated following the release instructions, hence those need to be updated too. Since sha256 is included in the vote email, would it be sufficient to ask voters to validate against that hash instead of restarting the vote? If the vote passes, I will update the instructions when I merge the branch into master. -Jesús On 3/14/18, 1:23 AM, "Julian Hyde" <jh...@apache.org> wrote: Following https://issues.apache.org/jira/browse/CALCITE-1972 I thought we were only going to generate .sha256, not .md5 anymore. (I did this for avatica-1.11 recently, apparently not for calcite1-15 when I was RM.) > On Mar 14, 2018, at 1:14 AM, Jesus Camacho Rodriguez <jcama...@apache.org> wrote: > > Hi all, > > I have created a build for Apache Calcite 1.16.0, release candidate 0. > > Thanks to everyone who has contributed to this release. > You can read the release notes here: > https://github.com/apache/calcite/blob/branch-1.16/site/_docs/history.md > > The commit to be voted upon: > http://git-wip-us.apache.org/repos/asf/calcite/commit/96b73060fdc77db9c13e40f6f82f374bb41d12c8 > > Its hash is 96b73060fdc77db9c13e40f6f82f374bb41d12c8. 
> > The artifacts to be voted on are located here: > https://dist.apache.org/repos/dist/dev/calcite/apache-calcite-1.16.0-rc0 > > The hashes of the artifacts are as follows: > src.tar.gz.md5 4849ebf6a1968111aca7c86ace193a9c > src.tar.gz.sha256 92d8c18cdcda81e81ddbd53bab92acdd53b021bc0d452a0b8c21f51ecb1dfcdf > src.zip.md5 0da85ae5de97a0b5d989c140e5d41044 > src.zip.sha256 d096286a00f2cee588485f4bcb9d051eaff4ed2e4874084464ec76eea257a79e > > A staged Maven repository is available for review at: > https://repository.apache.org/content/repositories/orgapachecalcite-1042 > > Release artifacts are signed with the following key: > https://people.apache.org/keys/committer/jcamacho.asc > > Please vote on releasing this package as Apache Calcite 1.16.0. > > The vote is open for the next 72 hours and passes if a majority of > at least three +1 PMC votes are cast. > > [ ] +1 Release this package as Apache Calcite 1.16.0 > [ ] 0 I don't feel strongly about it, but I'm okay with the release > [ ] -1 Do not release this package because... > > > Here is my vote: > > +1 (binding) > > Jesús > > >
[VOTE] Release apache-calcite-1.16.0 (release candidate 0)
Hi all, I have created a build for Apache Calcite 1.16.0, release candidate 0. Thanks to everyone who has contributed to this release. You can read the release notes here: https://github.com/apache/calcite/blob/branch-1.16/site/_docs/history.md The commit to be voted upon: http://git-wip-us.apache.org/repos/asf/calcite/commit/96b73060fdc77db9c13e40f6f82f374bb41d12c8 Its hash is 96b73060fdc77db9c13e40f6f82f374bb41d12c8. The artifacts to be voted on are located here: https://dist.apache.org/repos/dist/dev/calcite/apache-calcite-1.16.0-rc0 The hashes of the artifacts are as follows: src.tar.gz.md5 4849ebf6a1968111aca7c86ace193a9c src.tar.gz.sha256 92d8c18cdcda81e81ddbd53bab92acdd53b021bc0d452a0b8c21f51ecb1dfcdf src.zip.md5 0da85ae5de97a0b5d989c140e5d41044 src.zip.sha256 d096286a00f2cee588485f4bcb9d051eaff4ed2e4874084464ec76eea257a79e A staged Maven repository is available for review at: https://repository.apache.org/content/repositories/orgapachecalcite-1042 Release artifacts are signed with the following key: https://people.apache.org/keys/committer/jcamacho.asc Please vote on releasing this package as Apache Calcite 1.16.0. The vote is open for the next 72 hours and passes if a majority of at least three +1 PMC votes are cast. [ ] +1 Release this package as Apache Calcite 1.16.0 [ ] 0 I don't feel strongly about it, but I'm okay with the release [ ] -1 Do not release this package because... Here is my vote: +1 (binding) Jesús
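Voters who want to check a downloaded artifact against the SHA-256 value listed in a vote email could do so with a few lines of standard Java. This is only a sketch: the class name and command-line arguments are hypothetical, and it reads the whole file into memory, which is acceptable for a source tarball but not for very large files.

```java
import java.nio.file.Files;
import java.nio.file.Paths;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

public class Sha256Sketch {
  /** Returns the lowercase hex SHA-256 digest of the given bytes. */
  static String sha256Hex(byte[] data) {
    try {
      MessageDigest md = MessageDigest.getInstance("SHA-256");
      StringBuilder sb = new StringBuilder();
      for (byte b : md.digest(data)) {
        sb.append(String.format("%02x", b));
      }
      return sb.toString();
    } catch (NoSuchAlgorithmException e) {
      // SHA-256 is required to be available on every Java platform.
      throw new IllegalStateException(e);
    }
  }

  /** Compares the digest of the data with an expected hex string, ignoring case. */
  static boolean matches(byte[] data, String expectedHex) {
    return sha256Hex(data).equalsIgnoreCase(expectedHex.trim());
  }

  public static void main(String[] args) throws Exception {
    // Usage (hypothetical): java Sha256Sketch <artifact-file> <expected-sha256>
    byte[] data = Files.readAllBytes(Paths.get(args[0]));
    System.out.println(matches(data, args[1]) ? "OK" : "MISMATCH");
  }
}
```

A matching hash only shows that the download is intact; the signature made with the release manager's key still needs to be verified separately.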
[jira] [Created] (CALCITE-2212) Avatica - Enforce Java version via maven-enforcer-plugin
Jesus Camacho Rodriguez created CALCITE-2212: Summary: Avatica - Enforce Java version via maven-enforcer-plugin Key: CALCITE-2212 URL: https://issues.apache.org/jira/browse/CALCITE-2212 Project: Calcite Issue Type: Task Components: avatica, core Reporter: Josh Elser Assignee: Kevin Risden Fix For: 1.16.0 Now that jdk7 support has been dropped, we should add some logic to the build to fail obviously when a version of Java is used that we don't support. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
Re: [DISCUSS] Towards Calcite 1.16.0
Hi team, Now that Avatica 1.11.0 has been released (thanks Julian!), it is time to move on with Calcite 1.16.0 release (CALCITE-2181). I would like to create the first RC either tomorrow or Wednesday as we agreed previously. The most critical issue that remains open is CALCITE-2027. Julian, will it make it into the release? Let me know how I can help you with it. -Jesús On 2/28/18, 11:51 AM, "Julian Hyde" <jh...@apache.org> wrote: I’ve reviewed it. You’ve made a great start, and it might make it into 1.16. > On Feb 28, 2018, at 11:02 AM, Shuyi Chen <suez1...@gmail.com> wrote: > > Hi Julian, I've completed the initial PR for > https://issues.apache.org/jira/browse/CALCITE-2045. Could you please take a > look to see if it can make it to 1.16? Thanks a lot. > > Shuyi > > On Tue, Feb 27, 2018 at 1:06 PM, Jesus Camacho Rodriguez < > jcama...@apache.org> wrote: > >> Thanks Julian. >> >> I did not reply to previous email, but plan sounded reasonable to me. I >> will be ready to have the first Calcite RC shortly after Avatica is >> released. >> >> -Jesús >> >> >> On 2/27/18, 11:49 AM, "Julian Hyde" <jh...@apache.org> wrote: >> >>Did we agree a target date for Calcite 1.16? >> >>In any case, I think we’d better do an Avatica release very soon. I >> volunteer to be release manager for Avatica. Can we aim for an RC of >> Avatica 1.11 on Monday (March 5)? >> >>Then, if all goes well, the first RC for Calcite 1.16 could be as >> early as Monday March 12. As RM, I’ll let Jesus drive the Calcite release. >> >>Julian >> >> >>> On Feb 17, 2018, at 2:44 AM, Chris Baynes <ch...@contiamo.com> >> wrote: >>> >>> New release would be great! >>> I'd very much like to get the ClickHouse dialect PR in the next >> release: >>> https://issues.apache.org/jira/browse/CALCITE-2157 >>> though I haven't figured out an easy way to integrate this into the >> calcite >>> test dataset (commented about this in the jira issue). 
>>> >>> On Sat, Feb 17, 2018 at 9:22 AM, Shuyi Chen <suez1...@gmail.com> >> wrote: >>> >>>> Thanks a lot, Julian. I think I can try to make it before Mid March. >>>> >>>> On Fri, Feb 16, 2018 at 2:46 PM, Julian Hyde <jh...@apache.org> >> wrote: >>>> >>>>> Shuyi, >>>>> >>>>> I forgot to ask: What would be your preferred time-frame for the >> release? >>>>> >>>>> Julian >>>>> >>>>> >>>>>> On Feb 16, 2018, at 10:59 AM, Julian Hyde <jh...@apache.org> >> wrote: >>>>>> >>>>>> My preferred time-frame would be to start a vote on Mar 1 or soon >>>> after. >>>>> So we’d have a release by say Mar 10. That gives us 2 weeks to >> produce an >>>>> Avatica release. But I’m flexible - the release could happen a >> couple of >>>>> weeks later in March. >>>>>> >>>>>> In my queue I have the Geode and Jethro adapters. Both are almost >> ready >>>>> to commit - I just need to get clean test runs. I am interested in >>>>> finishing https://issues.apache.org/jira/browse/CALCITE-2160 < >>>>> https://issues.apache.org/jira/browse/CALCITE-2160> (a grid index >> to >>>>> support spatial joins) or at least some blocker issues regarding >> CROSS >>>>> APPLY that I ran into while working on this. >>>>>> >>>>>> Take a look at the pull request queue https://github.com/apache/ >>>>> calcite/pulls <https://github.com/apache/calcite/pulls> - there >> are a >>>> lot >>>>> of half-complete PRs, more than usual. If the authors want make a >>>> concerted >>>>> effort to complete them before the release they should speak up >> now, and >>>> we >>>>> can be flexible. >>>>>> >>>>>> Julian >>>>>> >>>>>>> On Feb
Re: [VOTE] Release apache-calcite-avatica-1.11.0 (release candidate 0)
+1 (binding) Downloaded, checked hashes, installed, ran tests. Thanks Julian! Jesús On 3/7/18, 8:08 AM, "michael.m...@gmail.com on behalf of Michael Mior"wrote: +1 (binding) Verified hash and checksum of the tarball and compiled and ran tests. I hope we can get another PMC member to test this soon so we can get this release out! -- Michael Mior mm...@apache.org 2018-03-05 23:47 GMT-05:00 Julian Hyde : > Hi all, > > I have created a build for Apache Calcite avatica-1.11.0, release > candidate 0. > > Thanks to everyone who has contributed to this release. > You can read the release notes here: > https://github.com/apache/calcite-avatica/blob/branch- > avatica-1.11/site/_docs/history.md > > The commit to be voted upon: > http://git-wip-us.apache.org/repos/asf/calcite-avatica/commit/ > e533391b9acfb9623a0f00cb40937aee5aa7f2cd > > Its hash is e533391b9acfb9623a0f00cb40937aee5aa7f2cd. > > The artifacts to be voted on are located here: > https://dist.apache.org/repos/dist/dev/calcite/apache- > calcite-avatica-1.11.0-rc0 > > The hashes of the artifacts are as follows: > src.tar.gz.sha256 c671bc18449ec47d1ff67cfffe51eb > 9c08e1b35e5a2382a5b04cebbbcddfcbab > src.zip.sha256 72e566591dc44e79e2aba7219525ab > 09af75f5a2e80e45b526e5aa8e48495c5d > > A staged Maven repository is available for review at: > https://repository.apache.org/content/repositories/orgapachecalcite-1041 > > Release artifacts are signed with the following key: > https://people.apache.org/keys/committer/jhyde.asc > > Please vote on releasing this package as Apache Calcite Avatica 1.11.0. > > The vote is open for the next 72 hours and passes if a majority of > at least three +1 PMC votes are cast. > > [ ] +1 Release this package as Apache Calcite Avatica 1.11.0 > [ ] 0 I don't feel strongly about it, but I'm okay with the release > [ ] -1 Do not release this package because... > > > Here is my vote: > > +1 (binding) > > Julian > >
Re: Updates on Benchmarking and Optimization Research for Calcite
Hi Ashwin, 1) It is important that table/column stats are available, so that Calcite can correctly trigger its cost-based optimizations. You can gather them either manually, by running an ANALYZE... COMPUTE STATISTICS FOR COLUMNS statement, or by enabling hive.stats.autogather. 2) The Calcite-based optimizer is enabled by default, hence you do not need to set any other flag. Calcite will log messages during optimization, so if you set the correct logger level for Calcite (e.g. DEBUG), you will see messages, e.g., with the Calcite rules that have been triggered. In addition, the optimization time for every optimization stage is recorded using PerfLogger, so you will be able to see this information in the logs (or you could add your own if you need to). If you have more questions about the Hive optimizer vs Calcite in general, I would suggest that you ask them on the Hive dev list, as you may be able to get more help over there. -Jesús On 3/3/18, 7:40 AM, "AshwinKumar AshwinKumar" wrote: Hello Dev Team, I am trying to run queries on Apache HIVE by setting the flag *hive.cbo.enabled* to true and also to false and then compare the metrics. I have a few questions regarding the same - 1. Do I need to set *hive.stats.autogather(to gather the tables statistics)* to true as well before setting turning on the CBO. 2. Is there any other flags which I need to set to activate the calcite CBO . Also could you please let me know what is best way to obtain any instrumentation data from Calcite process. Thanks, Ashwin On Thu, Mar 1, 2018 at 2:26 AM, Riccardo Tommasini < riccardo.tommas...@polimi.it> wrote: > Hello, > > I can definitely help if you need me to do something. > > And I would also like to join the online meeting. > > Cheers, > > On 20 Feb 2018, 22:13 +0100, Edmon Begoli , wrote: > Just a quick update on the progress of benchmarking setup for Calcite, and > a call to you for feedback and participation: > > 1. We (Ashwin Vajantri. 
member of my team) has installed Postgres and Hive > on our servers, and he has loaded TPC-DS benchmark data, and ran some test > queries. He also installed Calcite on top of Postgres so we can do > comparisons of performance for through Calcite vs. native. > (we have a full documentation for all this in a Google Doc I shared with > those interested in this work. We'll make if public once complete) > > 2. Another colleague, Dr. Seung-Hwan Lim is ready to look into more > detailed benchmarking and optimization aspects, as well as to look into > other engines that we work with and know -- MapD, Spark, Druid, Cassandra, > or Flink. > > All this so far is based, and in support of following JIRA issues: > https://issues.apache.org/jira/projects/CALCITE/issues/CALCITE-2168 > https://issues.apache.org/jira/projects/CALCITE/issues/CALCITE-2169 > > My question to the community is: > > 1. Does anyone have any feedback on specific queries or engines we want to > target, and start with? > > 2. How can we meaningfully turn on and turn off Hive optimizer to measure > the performance? > > 3. Anyone wants to pitch in help in any area? > > I am planning to schedule an online meeting next week to connect and > discuss for those interested. >
Re: [DISCUSS] Towards Calcite 1.16.0
Thanks Julian. I did not reply to previous email, but plan sounded reasonable to me. I will be ready to have the first Calcite RC shortly after Avatica is released. -Jesús On 2/27/18, 11:49 AM, "Julian Hyde" <jh...@apache.org> wrote: Did we agree a target date for Calcite 1.16? In any case, I think we’d better do an Avatica release very soon. I volunteer to be release manager for Avatica. Can we aim for an RC of Avatica 1.11 on Monday (March 5)? Then, if all goes well, the first RC for Calcite 1.16 could be as early as Monday March 12. As RM, I’ll let Jesus drive the Calcite release. Julian > On Feb 17, 2018, at 2:44 AM, Chris Baynes <ch...@contiamo.com> wrote: > > New release would be great! > I'd very much like to get the ClickHouse dialect PR in the next release: > https://issues.apache.org/jira/browse/CALCITE-2157 > though I haven't figured out an easy way to integrate this into the calcite > test dataset (commented about this in the jira issue). > > On Sat, Feb 17, 2018 at 9:22 AM, Shuyi Chen <suez1...@gmail.com> wrote: > >> Thanks a lot, Julian. I think I can try to make it before Mid March. >> >> On Fri, Feb 16, 2018 at 2:46 PM, Julian Hyde <jh...@apache.org> wrote: >> >>> Shuyi, >>> >>> I forgot to ask: What would be your preferred time-frame for the release? >>> >>> Julian >>> >>> >>>> On Feb 16, 2018, at 10:59 AM, Julian Hyde <jh...@apache.org> wrote: >>>> >>>> My preferred time-frame would be to start a vote on Mar 1 or soon >> after. >>> So we’d have a release by say Mar 10. That gives us 2 weeks to produce an >>> Avatica release. But I’m flexible - the release could happen a couple of >>> weeks later in March. >>>> >>>> In my queue I have the Geode and Jethro adapters. Both are almost ready >>> to commit - I just need to get clean test runs. 
I am interested in >>> finishing https://issues.apache.org/jira/browse/CALCITE-2160 < >>> https://issues.apache.org/jira/browse/CALCITE-2160> (a grid index to >>> support spatial joins) or at least some blocker issues regarding CROSS >>> APPLY that I ran into while working on this. >>>> >>>> Take a look at the pull request queue https://github.com/apache/ >>> calcite/pulls <https://github.com/apache/calcite/pulls> - there are a >> lot >>> of half-complete PRs, more than usual. If the authors want make a >> concerted >>> effort to complete them before the release they should speak up now, and >> we >>> can be flexible. >>>> >>>> Julian >>>> >>>>> On Feb 16, 2018, at 10:51 AM, Shuyi Chen <suez1...@gmail.com >> suez1...@gmail.com>> wrote: >>>>> >>>>> I am working on https://issues.apache.org/jira/browse/CALCITE-2045 < >>> https://issues.apache.org/jira/browse/CALCITE-2045> (Create >>>>> Type DDL). I was hoping to get it in for 1.16. Do we have the deadline >>> for >>>>> the 1.16 feature cut? Thanks. >>>>> >>>>> On Fri, Feb 16, 2018 at 10:43 AM, Julian Hyde <jh...@apache.org >>> <mailto:jh...@apache.org>> wrote: >>>>> >>>>>> Sounds good. It’s about time for 1.16, and thanks for offering to be >>>>>> release manager. >>>>>> >>>>>> But I’d like to get Avatica 1.11 [1] released first — it’s been 9 >>> months — >>>>>> and make Calcite 1.16 depend on Avatica 1.11. See email thread[2]. >>> Although >>>>>> that thread talks about an Avatica-Go release, that should not be a >>> blocker >>>>>> for Calcite 1.16. For both Avatica and Calcite we should update >>>>>> dependencies. >>>>>> >>>>>> Julian >>>>>> >>>>>> [1] https://issues.apache.org/jira/browse/ CALCITE-2182 < >>> https://issues.apache.org/jira/browse/CALCITE-2182> < >>>>>> https://issues.apache.org/jira/browse/CALCITE-2182 < >>> https://issues.apache.org/jira/browse/CALCITE-2182>> >>>>>> >>>>>> [2] https://lists
[jira] [Created] (CALCITE-2192) RelBuilder might skip creation of Aggregate though it has a column pruning effect
Jesus Camacho Rodriguez created CALCITE-2192: Summary: RelBuilder might skip creation of Aggregate though it has a column pruning effect Key: CALCITE-2192 URL: https://issues.apache.org/jira/browse/CALCITE-2192 Project: Calcite Issue Type: Bug Components: core Reporter: Jesus Camacho Rodriguez Assignee: Jesus Camacho Rodriguez Fix For: 1.16.0 Issue can be reproduced with following test: {code:java} @Test public void testAggregate3() { final RelBuilder builder = RelBuilder.create(config().build()); RelNode root = builder.scan("EMP") .aggregate( builder.groupKey(builder.field(1)), builder.aggregateCall(SqlStdOperatorTable.COUNT, false, false, null, "C")) .aggregate( builder.groupKey(builder.field(0))) .build(); assertThat(str(root), is("" + "LogicalProject(ENAME=[$0])\n" + " LogicalAggregate(group=[{1}], C=[COUNT()])\n" + "LogicalTableScan(table=[[scott, EMP]])\n")); } {code} Without fix, builder will generate following plan, which contains an unnecessary field (in Hive, this results in an assertion error in RelFieldTrimmer): {code:java} LogicalAggregate(group=[{1}], C=[COUNT()]) LogicalTableScan(table=[[scott, EMP]]) {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (CALCITE-2190) Extend SubstitutionVisitor.splitFilter to cover different order of operands
Jesus Camacho Rodriguez created CALCITE-2190: Summary: Extend SubstitutionVisitor.splitFilter to cover different order of operands Key: CALCITE-2190 URL: https://issues.apache.org/jira/browse/CALCITE-2190 Project: Calcite Issue Type: Improvement Components: core Reporter: Jesus Camacho Rodriguez Assignee: Jesus Camacho Rodriguez {{SubstitutionVisitor.splitFilter}} performs a structural comparison to identify relevant predicates. The method could sort the operands of some expressions in a deterministic way to maximize possible matches. For instance, currently this example yields correct results: {code} condition: x = 1 or y = 2 target: y = 2 or x = 1 -> residue: true {code} However, the following equivalent example fails: {code} condition: x = 1 or y = 2 target: y = 2 or 1 = x {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
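The operand-sorting idea in CALCITE-2190 can be sketched as follows. This is only an illustration under stated assumptions, not Calcite's implementation: the `Expr` class and the `canonical` and `sameStructure` helpers are hypothetical, and expressions are modelled as an operator name plus operands rather than as RexNode trees. Operands of commutative operators (`=`, `AND`, `OR`) are sorted by their string digest before the structural comparison, so that `1 = x` and `x = 1` compare equal.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class OperandOrder {
  /** Hypothetical expression node (not a Calcite class); a leaf has no operands. */
  static final class Expr {
    final String op;
    final List<Expr> operands;

    Expr(String op, Expr... operands) {
      this.op = op;
      this.operands = Arrays.asList(operands);
    }

    /** Digest used both for printing and for structural comparison. */
    @Override public String toString() {
      if (operands.isEmpty()) {
        return op;
      }
      List<String> parts = new ArrayList<>();
      for (Expr e : operands) {
        parts.add(e.toString());
      }
      return op + "(" + String.join(", ", parts) + ")";
    }
  }

  /** Operators whose operands may be reordered without changing the meaning. */
  private static final Set<String> COMMUTATIVE =
      new HashSet<>(Arrays.asList("=", "AND", "OR"));

  /** Recursively sorts the operands of commutative operators by digest. */
  static Expr canonical(Expr e) {
    Expr[] ops = new Expr[e.operands.size()];
    for (int i = 0; i < ops.length; i++) {
      ops[i] = canonical(e.operands.get(i));
    }
    if (COMMUTATIVE.contains(e.op)) {
      Arrays.sort(ops, Comparator.comparing(Expr::toString));
    }
    return new Expr(e.op, ops);
  }

  /** Structural comparison performed on the canonical forms. */
  static boolean sameStructure(Expr a, Expr b) {
    return canonical(a).toString().equals(canonical(b).toString());
  }

  public static void main(String[] args) {
    Expr x = new Expr("x");
    Expr y = new Expr("y");
    Expr one = new Expr("1");
    Expr two = new Expr("2");
    // condition: x = 1 or y = 2
    Expr condition = new Expr("OR", new Expr("=", x, one), new Expr("=", y, two));
    // target: y = 2 or 1 = x
    Expr target = new Expr("OR", new Expr("=", y, two), new Expr("=", one, x));
    System.out.println(sameStructure(condition, target)); // prints true
  }
}
```

Sorting by digest is just one deterministic order; any total order applied consistently to both the condition and the target would serve the same purpose.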
[jira] [Created] (CALCITE-2189) RelMdAllPredicates fast bail out creates mismatch with RelMdTableReferences
Jesus Camacho Rodriguez created CALCITE-2189: Summary: RelMdAllPredicates fast bail out creates mismatch with RelMdTableReferences Key: CALCITE-2189 URL: https://issues.apache.org/jira/browse/CALCITE-2189 Project: Calcite Issue Type: Bug Components: core Reporter: Jesus Camacho Rodriguez Assignee: Jesus Camacho Rodriguez Fix For: 1.16.0 The idea behind the metadata providers introduced in CALCITE-1682 is that we can identify the lineage of the expressions. If we bypass assigning a unique identifier for the table in RelMdAllPredicates because there are no predicates on those tables, then we will end up referencing wrong tables. E.g. {code} select x.sal from (select a.deptno, c.sal from (select * from emp limit 7) as a cross join (select * from dept limit 1) as b cross join (select * from emp where empno = 5 limit 2) as c) as x; {code} Table refs: {{[[CATALOG, SALES, DEPT].#0, [CATALOG, SALES, EMP].#0, [CATALOG, SALES, EMP].#1]}} Extracted predicate without fix (wrong): {{=([CATALOG, SALES, EMP].#0.$0, 5)}} Extracted predicate with fix (correct): {{=([CATALOG, SALES, EMP].#1.$0, 5)}} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[DISCUSS] Towards Calcite 1.16.0
Calcite 1.15.0 was released on December 11 (more than two months ago). We have resolved over 50 issues since then, hence I think we should start discussing the release of Calcite 1.16.0 (I can be release manager if nobody else steps up). I have created [1] to keep track. CALCITE-2027 (dropping JDK7 support) was targeted for 1.16.0. Julian, Michael, you have been working on that issue, could you share the current status? Are there any other particular features that people would like to include? -Jesús [1] https://issues.apache.org/jira/browse/CALCITE-2181
[jira] [Created] (CALCITE-2181) Release Calcite 1.16.0
Jesus Camacho Rodriguez created CALCITE-2181: Summary: Release Calcite 1.16.0 Key: CALCITE-2181 URL: https://issues.apache.org/jira/browse/CALCITE-2181 Project: Calcite Issue Type: Task Reporter: Jesus Camacho Rodriguez Assignee: Jesus Camacho Rodriguez Release Calcite 1.16.0 -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (CALCITE-2179) General improvements for materialized view rewriting rule
Jesus Camacho Rodriguez created CALCITE-2179: Summary: General improvements for materialized view rewriting rule Key: CALCITE-2179 URL: https://issues.apache.org/jira/browse/CALCITE-2179 Project: Calcite Issue Type: Improvement Components: core Reporter: Jesus Camacho Rodriguez Assignee: Jesus Camacho Rodriguez Fix For: 1.16.0 This issue is for extending the {{AbstractMaterializedViewRule}} rule: - Support for rolling up date nodes. For instance, rewrite in the following case: {code} Materialization: select "empid", floor(cast('1997-01-20' as timestamp) to month), count(*) + 1 as c, sum("empid") as s from "emps" group by "empid", floor(cast('1997-01-20' as timestamp) to month); Query: select floor(cast('1997-01-20' as timestamp) to year), sum("empid") as s from "emps" group by floor(cast('1997-01-20' as timestamp) to year); {code} - Add a flag to enable/disable fast bail out for joins. By default it is true, and thus we only create the rewriting in the minimal subtree of plan operators. For instance: {code} View: (A JOIN B) JOIN C Query: (((A JOIN B) JOIN D) JOIN C) JOIN E {code} We produce it at: {code} ((A JOIN B) JOIN D) JOIN C {code} But not at: {code} (((A JOIN B) JOIN D) JOIN C) JOIN E {code} This is important when the rule is used with the Volcano planner together with other rules, e.g. join reordering, as it prevents the search space from growing unnecessarily. However, if we use the rewriting rule in isolation, fast bail out can lead to missing rewriting opportunities (e.g. for bushy join trees). - Possibility to provide a HepProgram to optimize the query branch in union rewritings. Note that when we produce a partial rewriting with a Union, the branch that will execute the (partial) query can be fully rewritten so we can add the compensation predicate. (We cannot do the same for views because the expression might not be computable if the needed subexpressions are not available in the view output). 
If we use Volcano with a determined set of rules, this might not be needed, hence providing this program is optional. - Multiple small fixes discovered while testing. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (CALCITE-2178) Extend expression simplifier to work on datetime FLOOR functions
Jesus Camacho Rodriguez created CALCITE-2178: Summary: Extend expression simplifier to work on datetime FLOOR functions Key: CALCITE-2178 URL: https://issues.apache.org/jira/browse/CALCITE-2178 Project: Calcite Issue Type: Improvement Components: core Environment: Extend expression simplifier to support: {code} FLOOR(FLOOR(CAST('2010-01-10 00:00:00' AS TIMESTAMP) TO HOUR) TO DAY) => FLOOR(CAST('2010-01-10 00:00:00' AS TIMESTAMP) TO DAY) {code} Reporter: Jesus Camacho Rodriguez Assignee: Jesus Camacho Rodriguez Fix For: 1.16.0 -- This message was sent by Atlassian JIRA (v7.6.3#76005)
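The equivalence behind the simplification in CALCITE-2178 can be checked outside Calcite with `java.time`. The sketch below is an illustration only: the `NestedFloor` class and its `nested` and `direct` helpers are made-up names, and `LocalDateTime.truncatedTo` stands in for SQL's `FLOOR(ts TO unit)`, which is an assumption about semantics rather than Calcite's code path.

```java
import java.time.LocalDateTime;
import java.time.temporal.ChronoUnit;

public class NestedFloor {
  /** Models FLOOR(FLOOR(ts TO inner) TO outer). */
  static LocalDateTime nested(LocalDateTime ts, ChronoUnit inner, ChronoUnit outer) {
    return ts.truncatedTo(inner).truncatedTo(outer);
  }

  /** Models FLOOR(ts TO outer). */
  static LocalDateTime direct(LocalDateTime ts, ChronoUnit outer) {
    return ts.truncatedTo(outer);
  }

  public static void main(String[] args) {
    LocalDateTime ts = LocalDateTime.of(2010, 1, 10, 13, 45, 30);
    LocalDateTime viaHour = nested(ts, ChronoUnit.HOURS, ChronoUnit.DAYS);
    LocalDateTime toDay = direct(ts, ChronoUnit.DAYS);
    // Flooring to HOUR first does not change the result of flooring to DAY.
    System.out.println(viaHour.equals(toDay)); // prints true
  }
}
```

The rewrite is only safe when every boundary of the outer unit is also a boundary of the inner unit, as with HOUR inside DAY. WEEK inside MONTH is a counterexample, since a week can start in the previous month.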
[jira] [Created] (CALCITE-2137) Materialized view rewriting not being triggered for some join queries
Jesus Camacho Rodriguez created CALCITE-2137: Summary: Materialized view rewriting not being triggered for some join queries Key: CALCITE-2137 URL: https://issues.apache.org/jira/browse/CALCITE-2137 Project: Calcite Issue Type: Bug Components: core Reporter: Jesus Camacho Rodriguez Assignee: Jesus Camacho Rodriguez Fix For: 1.16.0 The issue has to do with the column equivalences mapping for joins with equality predicates for columns that are output by the query or subquery (basically, there is a bug and we do not apply mapping). This results in missing rewriting opportunities as the top expression cannot be mapped from the query to the view. It can be reproduced with the following MV and query in {{MaterializationTest.java}}: MV: {code} select * from "emps" join "dependents" using ("empid"); {code} Query: {code} select "emps"."empid", "dependents"."empid", "emps"."deptno" from "emps" join "dependents" using ("empid") join "depts" "a" on ("emps"."deptno"="a"."deptno") where "emps"."name" = 'Bill'; {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (CALCITE-2074) Simplification of AND/OR expressions yields wrong results
Jesus Camacho Rodriguez created CALCITE-2074: Summary: Simplification of AND/OR expressions yields wrong results Key: CALCITE-2074 URL: https://issues.apache.org/jira/browse/CALCITE-2074 Project: Calcite Issue Type: Bug Components: core Affects Versions: 1.15.0 Reporter: Jesus Camacho Rodriguez Assignee: Julian Hyde Discovered while testing 1.15.0 RC0 with Hive. It seems this regression was introduced by CALCITE-1995. Consider the following query: {code} select * from ( select a.*,b.d d1,c.d d2 from t1 a join t2 b on (a.id1 = b.id) join t2 c on (a.id2 = b.id) where b.d <= 1 and c.d <= 1 ) z where d1 > 1 or d2 > 1 {code} We end up generating the following plan: {code} HiveProject(id1=[$0], id2=[$1], d1=[$3], d2=[$4]) HiveJoin(condition=[OR(=($3, 1), =($4, 1))], joinType=[inner], algorithm=[none], cost=[not available]) HiveJoin(condition=[AND(=($0, $2), =($1, $2))], joinType=[inner], algorithm=[none], cost=[not available]) HiveFilter(condition=[AND(IS NOT NULL($0), IS NOT NULL($1))]) HiveProject(id1=[$0], id2=[$1]) HiveTableScan(table=[[default.t1]], table:alias=[a]) HiveFilter(condition=[AND(<=($1, 1), IS NOT NULL($0))]) HiveProject(id=[$0], d=[$1]) HiveTableScan(table=[[default.t2]], table:alias=[b]) HiveFilter(condition=[<=($0, 1)]) HiveProject(d=[$1]) HiveTableScan(table=[[default.t2]], table:alias=[c]) {code} Observe that the condition in the top join is not correct. I can reproduce this in {{RexProgramTest.simplifyFilter}} with the following example: {code} // condition "a > 5 or b > 5" // with pre-condition "a <= 5 and b <= 5" // should yield "false" but yields "a = 5 or b = 5" checkSimplifyFilter(or(gt(aRef, literal5),gt(bRef, literal5)), RelOptPredicateList.of(rexBuilder, ImmutableList.of(le(aRef, literal5), le(bRef, literal5))), "false"); {code} -- This message was sent by Atlassian JIRA (v6.4.14#64029)
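The expected result of the {{RexProgramTest}} case above can be confirmed by brute force. The sketch below makes two assumptions: it uses a small integer range and ignores NULLs (so no three-valued logic), and the `SimplifyCheck` class and `countMatches` helper are hypothetical names. It counts the rows that satisfy the pre-condition `a <= 5 and b <= 5` together with a given predicate.

```java
import java.util.function.BiPredicate;

public class SimplifyCheck {
  /** Counts pairs in [0, 10] x [0, 10] that satisfy the pre-condition and p. */
  static int countMatches(BiPredicate<Integer, Integer> p) {
    int n = 0;
    for (int a = 0; a <= 10; a++) {
      for (int b = 0; b <= 10; b++) {
        boolean preCondition = a <= 5 && b <= 5;
        if (preCondition && p.test(a, b)) {
          n++;
        }
      }
    }
    return n;
  }

  public static void main(String[] args) {
    // Original condition: never true under the pre-condition, so it is "false".
    System.out.println(countMatches((a, b) -> a > 5 || b > 5)); // prints 0
    // Wrong simplification: true for some rows, so it is not equivalent.
    System.out.println(countMatches((a, b) -> a == 5 || b == 5)); // prints 11
  }
}
```

Because the two counts differ, replacing `a > 5 or b > 5` by `a = 5 or b = 5` changes the query result, which is the wrong top-join condition seen in the Hive plan.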
[jira] [Created] (CALCITE-2062) Prevent creating an Aggregate with grouping columns that are not present in grouping sets
Jesus Camacho Rodriguez created CALCITE-2062: Summary: Prevent creating an Aggregate with grouping columns that are not present in grouping sets Key: CALCITE-2062 URL: https://issues.apache.org/jira/browse/CALCITE-2062 Project: Calcite Issue Type: Bug Reporter: Jesus Camacho Rodriguez Assignee: Julian Hyde Reject an Aggregate if there are columns that are not in any of the grouping sets. See proposed fix in https://github.com/julianhyde/calcite/tree/2051-minimal-groupSet. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[ANNOUNCE] New Calcite PMC chair: Michael Mior
Calcite community members, I am pleased to announce that we have a new PMC chair and VP. I have resigned, and Michael was duly elected by the PMC and approved unanimously by the Board. Please join me in congratulating Michael! -Jesús
[jira] [Created] (CALCITE-2052) Remove SQL code style from MV documentation
Jesus Camacho Rodriguez created CALCITE-2052: Summary: Remove SQL code style from MV documentation Key: CALCITE-2052 URL: https://issues.apache.org/jira/browse/CALCITE-2052 Project: Calcite Issue Type: Improvement Reporter: Jesus Camacho Rodriguez Assignee: Jesus Camacho Rodriguez Fix For: 1.15.0 The SQL code style is not rendered properly on the website. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Created] (CALCITE-2051) Rules using Aggregate might check for simple grouping sets incorrectly
Jesus Camacho Rodriguez created CALCITE-2051: Summary: Rules using Aggregate might check for simple grouping sets incorrectly Key: CALCITE-2051 URL: https://issues.apache.org/jira/browse/CALCITE-2051 Project: Calcite Issue Type: Bug Components: core Reporter: Jesus Camacho Rodriguez Assignee: Jesus Camacho Rodriguez Priority: Critical Fix For: 1.15.0 CALCITE-1069 removed the indicator columns for Aggregate operators. In some places, the indicator boolean check was replaced by the following check: {{aggregate.getGroupSets().size() > 1}}. However, that check is incomplete; it should have been replaced by {{aggregate.getGroupType() == Group.SIMPLE}}. For instance: https://github.com/apache/calcite/blob/master/core/src/main/java/org/apache/calcite/rel/rules/AggregateProjectMergeRule.java#L91 -- This message was sent by Atlassian JIRA (v6.4.14#64029)
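One reading of why the size-based check in CALCITE-2051 is incomplete is sketched below. This is an assumption about the failure mode, not code from Calcite: `java.util.BitSet` stands in for Calcite's ImmutableBitSet, the `GroupTypeCheck` class and its helpers are hypothetical, and "simple" is taken to mean exactly one grouping set that equals the full group set.

```java
import java.util.Arrays;
import java.util.BitSet;
import java.util.List;

public class GroupTypeCheck {
  /** Builds a bit set with the given column indexes set. */
  static BitSet bits(int... indexes) {
    BitSet b = new BitSet();
    for (int i : indexes) {
      b.set(i);
    }
    return b;
  }

  /** The size-based check: treats the aggregate as simple unless it has several grouping sets. */
  static boolean simpleBySize(List<BitSet> groupSets) {
    return !(groupSets.size() > 1);
  }

  /** Stricter check (assumed definition): one grouping set, equal to the group set. */
  static boolean simpleByType(BitSet groupSet, List<BitSet> groupSets) {
    return groupSets.size() == 1 && groupSets.get(0).equals(groupSet);
  }

  public static void main(String[] args) {
    // Group set {0, 1} with a single grouping set {0} that omits column 1.
    BitSet groupSet = bits(0, 1);
    List<BitSet> groupSets = Arrays.asList(bits(0));
    System.out.println(simpleBySize(groupSets));           // prints true
    System.out.println(simpleByType(groupSet, groupSets)); // prints false
  }
}
```

Under this assumption the two checks disagree exactly when a single grouping set leaves out some grouping columns, which is the shape of aggregate that CALCITE-2062 proposes to reject.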
[jira] [Created] (CALCITE-2012) Replace LocalInterval by Interval in Druid adapter
Jesus Camacho Rodriguez created CALCITE-2012: Summary: Replace LocalInterval by Interval in Druid adapter Key: CALCITE-2012 URL: https://issues.apache.org/jira/browse/CALCITE-2012 Project: Calcite Issue Type: Task Components: druid Affects Versions: 1.14.0 Reporter: Jesus Camacho Rodriguez Assignee: Jesus Camacho Rodriguez Fix For: 1.15.0 CALCITE-1617 introduced LocalInterval as a proper way to close the gap between the semantics of the SQL timestamp type and Druid instants. After that, CALCITE-1947 introduced the 'timestamp with local time zone' type in Calcite and mapped the Druid time column to this type. Thus, we no longer need the LocalInterval class and can use Joda Interval instead, since the column represents an Instant rather than a LocalDateTime. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
Re: [ANNOUNCE] New committer: Christian Beikov
Congrats Christian, looking forward to continuing to collaborate! -Jesús On 10/12/17, 5:08 AM, "Fabian Hueske" wrote: >Congrats Christian! > >Best, Fabian > >2017-10-11 20:11 GMT+02:00 Julian Hyde : > >> On behalf of the PMC I am delighted to announce Christian Beikov as a new >> Calcite committer. >> >> Christian’s first contribution[1], quite out of the blue, was a new adapter >> for Elasticsearch5. Since then he has made various improvements to >> Calcite’s support for federation and materialization, and has been active >> in design discussions and helping users on the dev list. >> >> Please give Christian a warm welcome to the project! >> >> Julian >> >> [1] https://github.com/apache/calcite/commits/master?author=beikov < >> https://github.com/apache/calcite/commits/master?author=beikov>
Re: [DISCUSS] Draft board report
Julian, Michael, Thanks for the feedback, I will update the report and submit it to the board. -Jesús On 10/11/17, 11:12 AM, "Julian Hyde" <jh...@apache.org> wrote: >Yes… now that I’ve (one minute ago) announced it! > >> On Oct 11, 2017, at 10:50 AM, Michael Mior <mm...@uwaterloo.ca> wrote: >> >> LGTM, although should Christian Beikov be included as a new committer? >> >> -- >> Michael Mior >> mm...@apache.org >> >> 2017-10-10 22:41 GMT-04:00 Jesus Camacho Rodriguez <jcama...@apache.org>: >> >>> Calcite community, >>> >>> I attach the draft of the report I propose to file for the 10/18 Apache >>> board meeting. >>> >>> Please, let me know if you have any feedback. Also if you are aware of >>> any talks given about Calcite in addition to those that I have added! >>> >>> Thanks, >>> Jesús >>> >>> >>> >>> --- >>> >>> >>> ## Description: >>> Apache Calcite is a highly customizable framework for parsing and >>> planning queries on data in a wide variety of formats. It allows >>> database-like access, and in particular a SQL interface and advanced >>> query optimization, for data not residing in a traditional database. >>> >>> Avatica is a sub-project within Calcite, and provides a framework >>> for building local and remote JDBC and ODBC database drivers. Avatica >>> has an independent release schedule, and since April 2017, it has its >>> own independent repository. >>> >>> >>> ## Issues: >>> >>> - There are no issues requiring board attention at this time. >>> >>> ## Activity: >>> >>> Development and mailing list activity is steady for both Calcite and >>> its Avatica sub-project. >>> >>> Since the last board meeting, there has been one Calcite release. >>> >>> Calcite 1.14.0 was released at the beginning of October. The release >>> included more than 60 new features, improvements and bug fixes. Among >>> others, the GEOMETRY data type was added along with 35 associated >>> functions part of the OpenGIS set. 
Thus, projects integrating with >>> Calcite can benefit from this initial spatial support. In addition, >>> a new adapter for Elasticsearch5 was added. The adapter was an >>> unexpected contribution by a new member in the community, which is >>> representative of the benefits of open-source software development. >>> Finally, a cool new feature was added: the "sqlsh" command, which >>> allows you to run SQL queries against your OS from your shell! >>> >>> >>> Our community continued growing this quarter, as a new committer >>> (Chris Baynes) was added to the project. >>> >>> Finally, there was an important presence of the Apache Calcite project >>> in talks at multiple events, such as FlinkForward 2017 (Berlin, Germany) >>> and DataWorks Summit APAC 2017 (Sydney, Australia). >>> >>> >>> ## Health report: >>> >>> Activity levels on mailing lists, git and JIRA are normal for both >>> Calcite and Avatica. >>> >>> >>> ## PMC changes: >>> >>> - Currently 16 PMC members. >>> - No new PMC members added in the last 3 months >>> - Last PMC addition was Michael Mior on Mon Apr 03 2017 >>> >>> ## Committer base changes: >>> >>> - Currently 26 committers. >>> - Chris Baynes was added as a committer on Wed Aug 23 2017 >>> >>> ## Releases: >>> >>> - 1.14.0 was released on Sun Oct 01 2017 >>> >>> ## JIRA activity: >>> >>> - 122 JIRA tickets created in the last 3 months >>> - 70 JIRA tickets closed/resolved in the last 3 months >>> >>> >>> >>> >
[DISCUSS] Draft board report
Calcite community, I attach the draft of the report I propose to file for the 10/18 Apache board meeting. Please, let me know if you have any feedback. Also if you are aware of any talks given about Calcite in addition to those that I have added! Thanks, Jesús --- ## Description: Apache Calcite is a highly customizable framework for parsing and planning queries on data in a wide variety of formats. It allows database-like access, and in particular a SQL interface and advanced query optimization, for data not residing in a traditional database. Avatica is a sub-project within Calcite, and provides a framework for building local and remote JDBC and ODBC database drivers. Avatica has an independent release schedule, and since April 2017, it has its own independent repository. ## Issues: - There are no issues requiring board attention at this time. ## Activity: Development and mailing list activity is steady for both Calcite and its Avatica sub-project. Since the last board meeting, there has been one Calcite release. Calcite 1.14.0 was released at the beginning of October. The release included more than 60 new features, improvements and bug fixes. Among others, the GEOMETRY data type was added along with 35 associated functions part of the OpenGIS set. Thus, projects integrating with Calcite can benefit from this initial spatial support. In addition, a new adapter for Elasticsearch5 was added. The adapter was an unexpected contribution by a new member in the community, which is representative of the benefits of open-source software development. Finally, a cool new feature was added: the "sqlsh" command, which allows you to run SQL queries against your OS from your shell! Our community continued growing this quarter, as a new committer (Chris Baynes) was added to the project. Finally, there was an important presence of the Apache Calcite project in talks at multiple events, such as FlinkForward 2017 (Berlin, Germany) and DataWorks Summit APAC 2017 (Sydney, Australia). 
## Health report: Activity levels on mailing lists, git and JIRA are normal for both Calcite and Avatica. ## PMC changes: - Currently 16 PMC members. - No new PMC members added in the last 3 months - Last PMC addition was Michael Mior on Mon Apr 03 2017 ## Committer base changes: - Currently 26 committers. - Chris Baynes was added as a committer on Wed Aug 23 2017 ## Releases: - 1.14.0 was released on Sun Oct 01 2017 ## JIRA activity: - 122 JIRA tickets created in the last 3 months - 70 JIRA tickets closed/resolved in the last 3 months
Re: [ANNOUNCE] Apache Calcite 1.14.0 released
Great! Thanks for all the work to make this release happen and improve the process documentation, Michael! -Jesús On 10/4/17, 2:17 PM, "Michael Mior" wrote: >The Apache Calcite team is pleased to announce the release of Apache >Calcite 1.14.0. > >Calcite is a dynamic data management framework. Its cost-based optimizer >converts queries, represented in relational algebra, into executable plans. >Calcite supports many front-end languages and back-end data engines, and >includes an SQL parser and, as a sub-project, the Avatica JDBC driver. > >This release comes three months after 1.13.0. It includes many improvements >and bug fixes. This release brings some big new features. The GEOMETRY data >type was added along with 35 associated functions as the start of support >for Simple Feature Access. There are also new adapters for ES5 and OS data >access. > >You can start using it in Maven by simply updating your dependency to: > > <dependency> > <groupId>org.apache.calcite</groupId> > <artifactId>calcite-core</artifactId> > <version>1.14.0</version> > </dependency> > >If you'd like to download the source release, you can find it here: > > http://www.apache.org/dyn/closer.cgi/calcite/apache-calcite-1.14.0/ > >You can read more about the release (including release notes) here: > > http://calcite.apache.org/news/2017/10/02/release-1.14.0/ > >We welcome your help and feedback. For more information on how to >report problems, and to get involved, visit the project website at: > > http://calcite.apache.org/ > >Thanks to everyone involved! > >Michael Mior, on behalf of the Apache Calcite Team
Re: 1.14.0 Release candidate
Let me try to repro the Cassandra problem, I will update the JIRA issue shortly. -Jesús On 9/13/17, 11:29 AM, "michael.m...@gmail.com on behalf of Michael Mior"wrote: >That's the major blocker here. I don't want to make a release when the >Cassandra adapter is completely broken for me. Although if others can >confirm that there are no issues, then maybe we should proceed and I'll >figure out the issues with my environment later. > >There are also a couple integration test failures on MongoDB that I'd like >to resolve, but at least one of those seems spurious as the generated plan >also seems ok. I'll open up JIRA for those and any failures I encounter >with details in case someone else is able to have a quick look. > >Other than test failures, I don't believe there are any outstanding bug >fixes or PRs that need to be merged. As mentioned earlier, if someone could >run integration tests for Oracle, that would be great. I had a brief look >over the Coverity scan results and didn't see anything that looks worth >blocking over, but I think it would be a good idea to have a more thorough >review in the future and also to set up some models to prune spurious >warnings so it's easier to take action in the future. > >-- >Michael Mior >mm...@apache.org > >2017-09-13 14:10 GMT-04:00 Julian Hyde : > >> How close are we to a release candidate? Commits to master are paused, >> so let's either make an RC soon or re-open the branch to commits. >> >> Michael and I seem to be deadlocked on >> https://issues.apache.org/jira/browse/CALCITE-1981. Michael gets a >> build error every time, I never get a build error or a runtime error. >> (I got one once - due to stale jars on my class path, I think.) >> >> Can someone else please try to reproduce the build error? Then we'll >> know whether it's me or Michael who has a messed-up environment. 
>> >> Julian >> >> >> On Fri, Sep 8, 2017 at 1:59 PM, Julian Hyde wrote: >> > I don’t think there is anyone who has knowledge of MongoDB and time to >> make the fixes. >> > >> >> On Sep 7, 2017, at 10:06 AM, Michael Mior wrote: >> >> >> >> Thanks. I'm also seeing some integration test failures for the MongoDB >> >> adapter. If someone more familiar with Mongo could check that out, that >> >> would be great. >> >> >> >> -- >> >> Michael Mior >> >> mm...@apache.org >> >> >> >> 2017-09-07 12:51 GMT-04:00 Julian Hyde : >> >> >> >>> OK, doing that now. The coverity config is a little rusty, so it may >> >>> take a while before I have it working. I'll let you know. >> >>> >> >>> I discovered a couple of tests that were failing on windows, so I'll >> >>> be committing fixes for those also. >> >>> >> >>> Other PRs should wait until after the release. >> >>> >> >>> On Wed, Sep 6, 2017 at 6:21 PM, Michael Mior >> wrote: >> Could you also trigger a new Coverity scan? I don't have write access >> to >> your repo. >> >> -- >> Michael Mior >> mm...@apache.org >> >> 2017-09-06 15:50 GMT-04:00 Julian Hyde : >> >> > I’m good to go. >> > >> > Today, I will check that Calcite build / test on still works Windows. >> > >> > Julian >> > >> >> On Sep 6, 2017, at 11:48 AM, Michael Mior >> wrote: >> >> >> >> As far as I know, all changes people are hoping to have in the >> 1.14.0 >> >> release have landed. Please speak up if that is not the case, >> >>> otherwise >> > I'm >> >> hoping to prepare RC0 for tomorrow. >> >> >> >> -- >> >> Michael Mior >> >> mm...@apache.org >> > >> > >> >>> >> > >>
[jira] [Created] (CALCITE-1983) Push EQUALS and NOT EQUALS operations with numeric cast on dimensions
Jesus Camacho Rodriguez created CALCITE-1983: Summary: Push EQUALS and NOT EQUALS operations with numeric cast on dimensions Key: CALCITE-1983 URL: https://issues.apache.org/jira/browse/CALCITE-1983 Project: Calcite Issue Type: Improvement Components: druid Affects Versions: 1.15.0 Reporter: slim bouguerra Assignee: Jesus Camacho Rodriguez Fix For: 1.15.0 For instance, the following query should be pushed as a time-series query with filters. {code} use ssb_druid; Time taken: 0.229 seconds, Fetched: 24 row(s) hive> > explain select > sum(discounted_price) > from > ssb_druid_day > where > lo_quantity = 25 ; OK Plan optimized by CBO. Vertex dependency in root stage Reducer 2 <- Map 1 (SIMPLE_EDGE) Stage-0 Fetch Operator limit:-1 Stage-1 Reducer 2 vectorized, llap File Output Operator [FS_14] Group By Operator [GBY_13] (rows=1 width=8) Output:["_col0"],aggregations:["sum(VALUE._col0)"] <-Map 1 [SIMPLE_EDGE] vectorized, llap SHUFFLE [RS_12] Group By Operator [GBY_11] (rows=1 width=8) Output:["_col0"],aggregations:["sum(discounted_price)"] Select Operator [SEL_10] (rows=1 width=0) Output:["discounted_price"] Filter Operator [FIL_9] (rows=1 width=0) predicate:(UDFToDouble(lo_quantity) = 25.0) TableScan [TS_0] (rows=589709 width=0) ssb_druid@ssb_druid_day,ssb_druid_day,Tbl:PARTIAL,Col:NONE,Output:["lo_quantity","discounted_price"],properties:{"druid.query.json":"{\"queryType\":\"select\",\"dataSource\":\"ssb_druid_day\",\"descending\":false,\"intervals\":[\"1900-01-01T00:00:00.000/3000-01-01T00:00:00.000\"],\"dimensions\":[\"c_city\",\"c_nation\",\"c_region\",\"d_weeknuminyear\",\"d_year\",\"d_yearmonth\",\"d_yearmonthnum\",\"lo_discount\",\"lo_quantity\",\"p_brand1\",\"p_category\",\"p_mfgr\",\"s_city\",\"s_nation\",\"s_region\"],\"metrics\":[\"lo_revenue\",\"discounted_price\",\"net_revenue\"],\"granularity\":\"all\",\"pagingSpec\":{\"threshold\":16384,\"fromNext\":true},\"context\":{\"druid.query.fetch\":false}}","druid.query.type":"select"} {code} -- This message was 
sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Created] (CALCITE-1982) NPE simplifying range expressions when literal value is null
Jesus Camacho Rodriguez created CALCITE-1982:

Summary: NPE simplifying range expressions when literal value is null
Key: CALCITE-1982
URL: https://issues.apache.org/jira/browse/CALCITE-1982
Project: Calcite
Issue Type: Bug
Components: core
Affects Versions: 1.14.0
Reporter: Jesus Camacho Rodriguez
Assignee: Jesus Camacho Rodriguez
Fix For: 1.14.0

The problem is that we do not check for {{null}} literal values before going into the method that tries to simplify the ranges.

{code}
java.lang.NullPointerException: null
  at com.google.common.base.Preconditions.checkNotNull(Preconditions.java:191) ~[guava-14.0.1.jar:?]
  at com.google.common.collect.Cut$AboveValue.<init>(Cut.java:301) ~[guava-14.0.1.jar:?]
  at com.google.common.collect.Cut.aboveValue(Cut.java:296) ~[guava-14.0.1.jar:?]
  at com.google.common.collect.Range.atMost(Range.java:249) ~[guava-14.0.1.jar:?]
  at org.apache.calcite.rex.RexSimplify.processRange(RexSimplify.java:849) ~[calcite-core-1.13.0.jar:1.13.0]
  at org.apache.calcite.rex.RexSimplify.simplifyAnd2ForUnknownAsFalse(RexSimplify.java:668) ~[calcite-core-1.13.0.jar:1.13.0]
  at org.apache.calcite.rex.RexSimplify.simplifyAnds(RexSimplify.java:226) ~[calcite-core-1.13.0.jar:1.13.0]
  at org.apache.calcite.tools.RelBuilder.filter(RelBuilder.java:811) ~[calcite-core-1.13.0.jar:1.13.0]
  at org.apache.calcite.tools.RelBuilder.filter(RelBuilder.java:801) ~[calcite-core-1.13.0.jar:1.13.0]
  ...
{code}

-- This message was sent by Atlassian JIRA (v6.4.14#64029)
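The kind of guard described in CALCITE-1982 can be sketched roughly as follows. This is only an illustration, not the actual patch: the class `NullSafeRangeSketch` and both of its methods are hypothetical, and `atMost` is a plain-Java stand-in that merely mimics the null rejection seen in the stack trace for Guava's `Range.atMost`.

```java
import java.util.Objects;
import java.util.Optional;

/** Hypothetical sketch; none of these names exist in Calcite or Guava. */
final class NullSafeRangeSketch {
  private NullSafeRangeSketch() {}

  /** Stand-in for a range factory that rejects null bounds. */
  static String atMost(Comparable<?> bound) {
    Objects.requireNonNull(bound); // mimics the checkNotNull in the stack trace
    return "(-inf.." + bound + "]";
  }

  /**
   * Checks the literal value before building a range. An empty result means
   * "do not try to simplify this predicate".
   */
  static Optional<String> rangeFor(Comparable<?> literalValue) {
    if (literalValue == null) {
      return Optional.empty(); // null literal: skip range simplification
    }
    return Optional.of(atMost(literalValue));
  }
}
```

With such a check in place, a caller would leave the predicate untouched whenever the result is empty instead of failing inside the range constructor.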
Re: Javadoc errors
Sure, I will. Thanks for fixing these Julian. -Jesús On 9/8/17, 1:44 PM, "Julian Hyde" <jh...@apache.org> wrote: >And I discovered two more breakages from that change. > >SqlTypeFamily references java.sql.Types.TIMESTAMP_WITH_TIMEZONE, which doesn’t >exist in JDK 1.7, which breaks the build under JDK 1.7. > >And Long(long) constructor is deprecated in JDK 1.9, which generates a >deprecation warning (which we treat as a build error) under JDK 1.9. > >I’ll fix them. > >But this close to the release, please run under JDK 1.7, 1.8 and 1.9, >including javadoc and checking for deprecation warnings, before every commit. > >Julian > > >> On Sep 7, 2017, at 1:50 PM, Jesus Camacho Rodriguez <jcama...@apache.org> >> wrote: >> >> Julian, >> >> I cannot repro in my environment. Could you share the errors that you are >> seeing? >> >> Thanks, >> >> -Jesús >> >> >> >> On 9/7/17, 1:12 PM, "Jesus Camacho Rodriguez" >> <jcamachorodrig...@hortonworks.com on behalf of jcama...@apache.org> wrote: >> >>> Sure, let me take a look. >>> >>> -Jesús >>> >>> >>> >>> >>> On 9/7/17, 10:44 AM, "Julian Hyde" <jh...@apache.org> wrote: >>> >>>> Jesus, >>>> >>>> I am seeing javadoc errors when I run "mvn site" (under JDK 1.8 on >>>> Windows, as it happens) that are very likely from your CALCITE-1947 >>>> commit. Can you fix ASAP. >>>> >>>> Julian >>>> >>> >> >
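The second breakage Julian mentions, the deprecated `Long(long)` constructor, can be illustrated with a minimal sketch. The class name `BoxingExample` is hypothetical and this is not the code from the commit in question; it only shows the usual replacement, `Long.valueOf`, which compiles without deprecation warnings on JDK 7, 8 and 9.

```java
/** Hypothetical example class, used only to illustrate the deprecation. */
final class BoxingExample {
  private BoxingExample() {}

  /** Boxes a primitive long without calling the deprecated constructor. */
  static Long box(long value) {
    // "new Long(value)" still compiles, but javac reports a deprecation
    // warning from JDK 9 on, which fails a build that treats warnings as errors.
    return Long.valueOf(value);
  }
}
```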
Re: Javadoc errors
Thanks Julian, I have pushed the fix. -Jesús On 9/7/17, 2:01 PM, "Julian Hyde" <jh...@apache.org> wrote: >Here is the output I get. Make sure you are using JDK 1.8. > >[ERROR] >/home/jhyde/open1/calcite.2/core/src/test/java/org/apache/calcite/rex/RexBuilderTest.java:182: >error: reference not found >[ERROR] * {@link >RexBuilder#makeTimestampWithLocalTimeZoneLiteral(TimestampWithTimeZoneString, >int)}. */ >[ERROR] ^ >[ERROR] >[ERROR] Command line was: >/old/usr/lib/jvm/jdk1.8.0_121/jre/../bin/javadoc @options @packages >@argfile > >Julian > > >On Thu, Sep 7, 2017 at 1:50 PM, Jesus Camacho Rodriguez ><jcama...@apache.org> wrote: >> Julian, >> >> I cannot repro in my environment. Could you share the errors that you are >> seeing? >> >> Thanks, >> >> -Jesús >> >> >> >> On 9/7/17, 1:12 PM, "Jesus Camacho Rodriguez" >> <jcamachorodrig...@hortonworks.com on behalf of jcama...@apache.org> wrote: >> >>>Sure, let me take a look. >>> >>>-Jesús >>> >>> >>> >>> >>>On 9/7/17, 10:44 AM, "Julian Hyde" <jh...@apache.org> wrote: >>> >>>>Jesus, >>>> >>>>I am seeing javadoc errors when I run "mvn site" (under JDK 1.8 on >>>>Windows, as it happens) that are very likely from your CALCITE-1947 >>>>commit. Can you fix ASAP. >>>> >>>>Julian >>>> >>> >> >
Re: Javadoc errors
Julian, I cannot repro in my environment. Could you share the errors that you are seeing? Thanks, -Jesús On 9/7/17, 1:12 PM, "Jesus Camacho Rodriguez" <jcamachorodrig...@hortonworks.com on behalf of jcama...@apache.org> wrote: >Sure, let me take a look. > >-Jesús > > > > >On 9/7/17, 10:44 AM, "Julian Hyde" <jh...@apache.org> wrote: > >>Jesus, >> >>I am seeing javadoc errors when I run "mvn site" (under JDK 1.8 on >>Windows, as it happens) that are very likely from your CALCITE-1947 >>commit. Can you fix ASAP. >> >>Julian >> >
Re: Javadoc errors
Sure, let me take a look. -Jesús On 9/7/17, 10:44 AM, "Julian Hyde" wrote: >Jesus, > >I am seeing javadoc errors when I run "mvn site" (under JDK 1.8 on >Windows, as it happens) that are very likely from your CALCITE-1947 >commit. Can you fix ASAP. > >Julian >
Re: Materialization performance
LGTM, I think by the time we have support for the outer joins, I might have had time to finish the filter tree index implementation too. -Jesús On 8/29/17, 3:11 AM, "Christian Beikov" <christian.bei...@gmail.com> wrote: >I'd like to stick to trying to figure out how to support outer joins for >now and when I have an implementation for that, I'd look into the filter >tree index if you haven't done it by then. > > >Mit freundlichen Grüßen, > >*Christian Beikov* >Am 28.08.2017 um 20:01 schrieb Jesus Camacho Rodriguez: >> Christian, >> >> The implementation of the filter tree index is what I was referring to >> indeed. In the initial implementation I focused on the rewriting coverage, >> but now that the first part is finished, it is at the top of my list as >> I think it is critical to make the whole query rewriting algorithm work >> at scale. However, I have not started yet. >> >> The filter tree index will help to filter not only based on the tables used >> by a given query, but also for queries that do not meet the equivalence >> classes conditions, filter conditions, etc. We could implement all the >> preconditions mentioned in the paper, and we could add our own additional >> ones. I also think that in a second version, we might need to maybe add >> some kind of ranking/limit as many views might meet the preconditions for >> a given query. >> >> It seems you understood how it should work, so if you could help to >> quickstart that work by maybe implementing a first version of the filter >> tree index with a couple of basic conditions (table matching and EC >> matching?), >> that would be great. I could review any of the contributions you make. 
>> >> -Jesús >> >> >> >> >> >> On 8/28/17, 3:22 AM, "Christian Beikov" <christian.bei...@gmail.com> wrote: >> >>> If the metadata was cached, that would be awesome, especially because >>> that would also improve the performance regarding the metadata retrieval >>> for the query currently being planned, although I am not sure how the >>> caching would work since the RelNodes are mutable. >>> >>> Have you considered implementing the filter tree index explained in the >>> paper? As far as I understood, the whole thing only works when a >>> redundant table elimination is implemented. Is that the case? If so, or >>> if it can be done easily, I'd propose we initialize all the lookup >>> structures during registration and use them during planning. This will >>> improve planning time drastically and essentially handle the scalability >>> problem you mention. >>> >>> What other MV-related issues are on your personal todo list Jesus? I >>> read the paper now and think I can help you in one place or another if >>> you want. >>> >>> >>> Mit freundlichen Grüßen, >>> >>> *Christian Beikov* >>> Am 28.08.2017 um 08:13 schrieb Jesus Camacho Rodriguez: >>>> Hive does not use the Calcite SQL parser, thus we follow a different path >>>> and did not experience the problem on the Calcite end. However, FWIW we >>>> avoided reparsing the SQL every time a query was being planned by >>>> creating/managing our own cache too. >>>> >>>> The metadata providers implement some caching, thus I would expect that >>>> once >>>> you avoid reparsing every MV, the retrieval time of predicates, lineage, >>>> etc. >>>> would improve (at least after using the MV for the first time). However, >>>> I agree that the information should be inferred when the MV is loaded. >>>> In fact, maybe just making some calls to the metadata providers while the >>>> MVs >>>> are being loaded would do the trick (Julian should confirm this). 
>>>> >>>> Btw, probably you will find another scalability issue as the number of MVs >>>> grows large with the current implementation of the rewriting, since the >>>> pre-filtering implementation in place does not discard many of the views >>>> that >>>> are not valid to rewrite a given query, and rewriting is attempted with all >>>> of them. >>>> This last bit is work that I would like to tackle shortly, but I have not >>>> created the corresponding JIRA yet. >>>> >>>> -Jesús >>>> >>>
Re: Wondering if there is an estimate for calcite 1.14 release
I agree that we should create a new release soon to continue with our normal release cadence. I would like CALCITE-1947 to go in before the release; I should push it this week. @Michael, AFAIK you need to be a committer to be the RM. -Jesús On 8/25/17, 6:27 PM, "LogSplitter" wrote: >Hi, > >Do you have to be a committer to be release manager? > >What's involved? I will volunteer anyway :-) > >Michael. > > >On 08/25/2017 04:07 PM, Julian Hyde wrote: >> It's about time we did a release. 1.13 was exactly 2 months ago. >> >> Does anyone have any constraints on the release date? (Must be before >> X, must be after Y?) >> >> Any volunteers to be release manager? >> >> Julian >> >> >> On Fri, Aug 25, 2017 at 3:07 PM, LogSplitter wrote: >>> Hello, >>> >>> I was wondering if there is any estimate of when calcite 1.14 might be >>> released. >>> >>> I am tossing up whether I use a patched version 1.13 release or wait a >>> little longer for real 1.14 for work I am needing to ship? >>> >>> thanks >>> Michael >
Re: Materialization performance
Christian, The implementation of the filter tree index is what I was referring to indeed. In the initial implementation I focused on the rewriting coverage, but now that the first part is finished, it is at the top of my list as I think it is critical to make the whole query rewriting algorithm work at scale. However, I have not started yet. The filter tree index will help to filter not only based on the tables used by a given query, but also for queries that do not meet the equivalence classes conditions, filter conditions, etc. We could implement all the preconditions mentioned in the paper, and we could add our own additional ones. I also think that in a second version, we might need to maybe add some kind of ranking/limit as many views might meet the preconditions for a given query. It seems you understood how it should work, so if you could help to quickstart that work by maybe implementing a first version of the filter tree index with a couple of basic conditions (table matching and EC matching?), that would be great. I could review any of the contributions you make. -Jesús On 8/28/17, 3:22 AM, "Christian Beikov" <christian.bei...@gmail.com> wrote: >If the metadata was cached, that would be awesome, especially because >that would also improve the performance regarding the metadata retrieval >for the query currently being planned, although I am not sure how the >caching would work since the RelNodes are mutable. > >Have you considered implementing the filter tree index explained in the >paper? As far as I understood, the whole thing only works when a >redundant table elimination is implemented. Is that the case? If so, or >if it can be done easily, I'd propose we initialize all the lookup >structures during registration and use them during planning. This will >improve planning time drastically and essentially handle the scalability >problem you mention. > >What other MV-related issues are on your personal todo list Jesus? 
I >read the paper now and think I can help you in one place or another if >you want. > > >Mit freundlichen Grüßen, >-------- >*Christian Beikov* >Am 28.08.2017 um 08:13 schrieb Jesus Camacho Rodriguez: >> Hive does not use the Calcite SQL parser, thus we follow a different path >> and did not experience the problem on the Calcite end. However, FWIW we >> avoided reparsing the SQL every time a query was being planned by >> creating/managing our own cache too. >> >> The metadata providers implement some caching, thus I would expect that once >> you avoid reparsing every MV, the retrieval time of predicates, lineage, etc. >> would improve (at least after using the MV for the first time). However, >> I agree that the information should be inferred when the MV is loaded. >> In fact, maybe just making some calls to the metadata providers while the MVs >> are being loaded would do the trick (Julian should confirm this). >> >> Btw, probably you will find another scalability issue as the number of MVs >> grows large with the current implementation of the rewriting, since the >> pre-filtering implementation in place does not discard many of the views that >> are not valid to rewrite a given query, and rewriting is attempted with all >> of them. >> This last bit is work that I would like to tackle shortly, but I have not >> created the corresponding JIRA yet. >> >> -Jesús >> >> >> >> >> On 8/27/17, 10:43 PM, "Rajat Venkatesh" <rvenkat...@qubole.com> wrote: >> >>> Thread Safety and repeated parsing is a problem. We have experience with >>> managing 10s of materialized views. Repeated parsing takes more time than >>> execution of the query itself. We also have a similar problem where >>> concurrent queries (with a different set of materialized views potentially) >>> may be planned at the same time. We solved it through maintaining a cache >>> and carefully setting the cache in a thread local. 
>>> Relevant code for inspiration: >>> https://github.com/qubole/quark/blob/master/optimizer/src/main/java/org/apache/calcite/prepare/Materializer.java >>> https://github.com/qubole/quark/blob/master/optimizer/src/main/java/org/apache/calcite/plan/QuarkMaterializeCluster.java >>> >>> >>> >>> On Sun, Aug 27, 2017 at 6:50 PM Christian Beikov >>> <christian.bei...@gmail.com> >>> wrote: >>> >>>> Hey, I have been looking a bit into how materialized views perform >>>> during the planning because of a very long test >>>> run(MaterializationTest#testJoinMaterializationUKFK6) and the c
Re: Materialization performance
Hive does not use the Calcite SQL parser, thus we follow a different path and did not experience the problem on the Calcite end. However, FWIW we avoided reparsing the SQL every time a query was being planned by creating/managing our own cache too. The metadata providers implement some caching, thus I would expect that once you avoid reparsing every MV, the retrieval time of predicates, lineage, etc. would improve (at least after using the MV for the first time). However, I agree that the information should be inferred when the MV is loaded. In fact, maybe just making some calls to the metadata providers while the MVs are being loaded would do the trick (Julian should confirm this). Btw, probably you will find another scalability issue as the number of MVs grows large with the current implementation of the rewriting, since the pre-filtering implementation in place does not discard many of the views that are not valid to rewrite a given query, and rewriting is attempted with all of them. This last bit is work that I would like to tackle shortly, but I have not created the corresponding JIRA yet. -Jesús On 8/27/17, 10:43 PM, "Rajat Venkatesh" wrote: >Thread Safety and repeated parsing is a problem. We have experience with >managing 10s of materialized views. Repeated parsing takes more time than >execution of the query itself. We also have a similar problem where >concurrent queries (with a different set of materialized views potentially) >may be planned at the same time. We solved it through maintaining a cache >and carefully setting the cache in a thread local. 
>Relevant code for inspiration: >https://github.com/qubole/quark/blob/master/optimizer/src/main/java/org/apache/calcite/prepare/Materializer.java >https://github.com/qubole/quark/blob/master/optimizer/src/main/java/org/apache/calcite/plan/QuarkMaterializeCluster.java > > > >On Sun, Aug 27, 2017 at 6:50 PM Christian Beikov >wrote: > >> Hey, I have been looking a bit into how materialized views perform >> during the planning because of a very long test >> run(MaterializationTest#testJoinMaterializationUKFK6) and the current >> state is problematic. >> >> CalcitePrepareImpl#getMaterializations always reparses the SQL and down >> the line, there is a lot of expensive work(e.g. predicate and lineage >> determination) done during planning that could easily be pre-calculated >> and cached during materialization creation. >> >> There is also a bit of a thread safety problem with the current >> implementation. Unless there is a different safety mechanism that I >> don't see, the sharing of the MaterializationService and thus also the >> maps in MaterializationActor via a static instance between multiple >> threads is problematic. >> >> Since I mentioned thread safety, how is Calcite supposed to be used in a >> multi-threaded environment? Currently I use a connection pool that >> initializes the schema on new connections, but that is not really nice. >> I suppose caches are also bound to the connection? A thread safe context >> that can be shared between connections would be nice to avoid all that >> repetitive work. >> >> Are these known issues which you have thought about how to fix or should >> I log JIRAs for these and fix them to the best of my knowledge? I'd more >> or less keep the service shared but would implement it using a copy on >> write strategy since I'd expect seldom schema changes after startup. 
>> >> Regarding the repetitive work that partly happens during planning, I'd >> suggest doing that during materialization registration instead like it >> is already mentioned CalcitePrepareImpl#populateMaterializations. Would >> that be ok? >> >> -- >> >> Mit freundlichen Grüßen, >> >> *Christian Beikov* >>
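The copy-on-write strategy Christian proposes for the shared service could look roughly like the sketch below. It is a minimal illustration under stated assumptions, not Calcite code: `CopyOnWriteRegistry` and its methods are hypothetical names and are unrelated to the real `MaterializationService` or `MaterializationActor` API. Readers always see an immutable snapshot, and the rare writer copies the map and swaps it in atomically.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.atomic.AtomicReference;

/** Hypothetical copy-on-write registry; not Calcite's MaterializationService. */
final class CopyOnWriteRegistry<K, V> {
  // The current immutable snapshot; replaced wholesale on every write.
  private final AtomicReference<Map<K, V>> snapshot =
      new AtomicReference<>(Collections.<K, V>emptyMap());

  /** Readers (e.g. planner threads) get a stable view without locking. */
  Map<K, V> snapshot() {
    return snapshot.get();
  }

  /** Writers copy the current map, apply the change, and swap atomically. */
  void register(K key, V value) {
    while (true) {
      Map<K, V> current = snapshot.get();
      Map<K, V> copy = new HashMap<>(current);
      copy.put(key, value);
      // Retry if another writer published a new snapshot in the meantime.
      if (snapshot.compareAndSet(current, Collections.unmodifiableMap(copy))) {
        return;
      }
    }
  }
}
```

The trade-off matches the expectation stated in the thread: writes cost a full copy, which is acceptable when schema changes after startup are rare, while reads during planning need no synchronization.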
Re: [DISCUSS] Draft board report
Thanks for the feedback! I will update the report and send it to the board by EOD. Thanks, Jesús On 7/12/17, 5:24 PM, "Josh Elser" <els...@apache.org> wrote: > > >On 7/11/17 7:37 PM, Julian Hyde wrote: >> Looks good. I would restore the line >> >> - Last PMC addition was Michael Mior on Mon Apr 03 2017 >> >> because the Board likes to monitor PMC & committer development. > >+1 Will definitely get a ding from them without that :) > >> I also get a sense that we are attracting a more diverse set of >> contributors than usual. The last 100 commits had 29 distinct >> contributors[1]. Typically each 100 commits has around 20 distinct >> contributors. > >Also +1. While I'm not too involved with the actual commits landing on >the Calcite side, I've been tickled with all of the new names and >use-cases showing up on dev@. Definitely a sign of healthy growth! > >> Thanks for writing the report! >> >> Julian >> >> [1] git log origin/master |grep Author|awk 'FNR < 100 {print}'|sort -u|wc >> >> On Tue, Jul 11, 2017 at 2:49 PM, Jesus Camacho Rodriguez >> <jcama...@apache.org> wrote: >>> Calcite community, >>> >>> I attach the draft of the report I propose to file for the 7/19 Apache >>> board meeting. >>> >>> Please, let me know if you have any feedback. >>> >>> Thanks, >>> Jesús >>> >>> >>> --- >>> >>> >>> Attachment O: Report from the Apache Calcite Project [Jesús Camacho >>> Rodríguez] >>> >>> ## Description: >>> Apache Calcite is a highly customizable framework for parsing and >>> planning queries on data in a wide variety of formats. It allows >>> database-like access, and in particular a SQL interface and advanced >>> query optimization, for data not residing in a traditional database. >>> >>> Avatica is a sub-module within Calcite, and provides a framework >>> for building local and remote JDBC and ODBC database drivers. Avatica >>> has an independent release schedule, and since April 2017, it has its >>> own independent repository. 
>>> >>> ## Issues: >>> - There are no issues requiring board attention at this time. >>> >>> ## Activity: >>> >>> Development and mailing list activity is steady for both Calcite and >>> its Avatica sub-project. >>> >>> Since the last board meeting, there has been one Calcite release >>> and one Avatica release. >>> >>> Avatica 1.10.0 was released at the end of May. As the Calcite and >>> Avatica projects become more separate, this was the first release >>> since Avatica’s git repository separated from Calcite’s repository >>> during the previous quarter. The release added support for JDBC Array >>> data, Docker, and JDK 9 (it continues to run on JDK 7 and 8). >>> In total, there were over 20 new features and bug fixes. >>> >>> In turn, Calcite 1.13.0 was released at the end of June. The release >>> included more than 75 resolved issues, comprising a large number of >>> new features as well as general improvements and bug-fixes. Among >>> others, Calcite was upgraded to use the recently released version of >>> Avatica. >>> >>> Our community continued growing this quarter: three new committers >>> (Slim Bouguerra, Kevin Liew, and Zhiqiang He) were added to the project. >>> >>> Finally, there was an important presence of the Apache Calcite project >>> in talks at multiple events, such as Apache: Big Data North America 2017 >>> (Miami, FL), PhoenixCon (San Francisco, CA) and >>> DataWorks Summit USA 2017 (San Jose, CA). >>> >>> ## Health report: >>> >>> Activity levels on mailing lists, git and JIRA are normal for both >>> Calcite and Avatica. >>> >>> ## PMC changes: >>> >>> - Currently 16 PMC members. >>> - No new PMC members added in the last 3 months >>> >>> ## Committer base changes: >>> >>> - Currently 25 committers. 
>>> - New committers: >>> - Slim Bouguerra was added as a committer on Sun Jun 18 2017 >>> - Kevin Liew was added as a committer on Sun Jun 18 2017 >>> - Zhiqiang He was added as a committer on Fri Jun 09 2017 >>> >>> ## Releases: >>> >>> - 1.13.0 was released on Mon Jun 26 2017 >>> - avatica-1.10.0 was released on Tue May 30 2017 >>> >>> ## JIRA activity: >>> >>> - 135 JIRA tickets created in the last 3 months >>> - 112 JIRA tickets closed/resolved in the last 3 months >>> >>> >
[DISCUSS] Draft board report
Calcite community, I attach the draft of the report I propose to file for the 7/19 Apache board meeting. Please, let me know if you have any feedback. Thanks, Jesús --- Attachment O: Report from the Apache Calcite Project [Jesús Camacho Rodríguez] ## Description: Apache Calcite is a highly customizable framework for parsing and planning queries on data in a wide variety of formats. It allows database-like access, and in particular a SQL interface and advanced query optimization, for data not residing in a traditional database. Avatica is a sub-module within Calcite, and provides a framework for building local and remote JDBC and ODBC database drivers. Avatica has an independent release schedule, and since April 2017, it has its own independent repository. ## Issues: - There are no issues requiring board attention at this time. ## Activity: Development and mailing list activity is steady for both Calcite and its Avatica sub-project. Since the last board meeting, there has been one Calcite release and one Avatica release. Avatica 1.10.0 was released at the end of May. As the Calcite and Avatica projects become more separate, this was the first release since Avatica’s git repository separated from Calcite’s repository during the previous quarter. The release added support for JDBC Array data, Docker, and JDK 9 (it continues to run on JDK 7 and 8). In total, there were over 20 new features and bug fixes. In turn, Calcite 1.13.0 was released at the end of June. The release included more than 75 resolved issues, comprising a large number of new features as well as general improvements and bug-fixes. Among others, Calcite was upgraded to use the recently released version of Avatica. Our community continued growing this quarter: three new committers (Slim Bouguerra, Kevin Liew, and Zhiqiang He) were added to the project. 
Finally, there was an important presence of the Apache Calcite project in talks at multiple events, such as Apache: Big Data North America 2017 (Miami, FL), PhoenixCon (San Francisco, CA) and DataWorks Summit USA 2017 (San Jose, CA). ## Health report: Activity levels on mailing lists, git and JIRA are normal for both Calcite and Avatica. ## PMC changes: - Currently 16 PMC members. - No new PMC members added in the last 3 months ## Committer base changes: - Currently 25 committers. - New committers: - Slim Bouguerra was added as a committer on Sun Jun 18 2017 - Kevin Liew was added as a committer on Sun Jun 18 2017 - Zhiqiang He was added as a committer on Fri Jun 09 2017 ## Releases: - 1.13.0 was released on Mon Jun 26 2017 - avatica-1.10.0 was released on Tue May 30 2017 ## JIRA activity: - 135 JIRA tickets created in the last 3 months - 112 JIRA tickets closed/resolved in the last 3 months
[ANNOUNCE] Apache Calcite 1.13.0 released
The Apache Calcite team is pleased to announce the release of Apache Calcite 1.13.0.

Calcite is a dynamic data management framework. Its cost-based optimizer converts queries, represented in relational algebra, into executable plans. Calcite supports many front-end languages and back-end data engines, and includes an SQL parser and, as a sub-project, the Avatica JDBC driver.

This release comes three months after 1.12.0. It includes more than 75 resolved issues, comprising a large number of new features as well as general improvements and bug-fixes. Among others, it includes the upgrade to Avatica 1.10.0, a new materialized view rewriting algorithm, extensions for streaming queries, improvements for the Druid adapter, and bug-fixes and general improvements for query planning.

You can start using it in Maven by simply updating your dependency to:

  <dependency>
    <groupId>org.apache.calcite</groupId>
    <artifactId>calcite-core</artifactId>
    <version>1.13.0</version>
  </dependency>

If you'd like to download the source release, you can find it here:

    http://www.apache.org/dyn/closer.cgi/calcite/apache-calcite-1.13.0/

You can read more about the release (including release notes) here:

    http://calcite.apache.org/news/2017/06/26/release-1.13.0/

We welcome your help and feedback. For more information on how to report problems, and to get involved, visit the project website at:

    http://calcite.apache.org/

Thanks to everyone involved!

Jesus Camacho Rodriguez, on behalf of the Apache Calcite Team
[jira] [Created] (CALCITE-1859) NPE in validate method of VolcanoPlanner
Jesus Camacho Rodriguez created CALCITE-1859:

Summary: NPE in validate method of VolcanoPlanner
Key: CALCITE-1859
URL: https://issues.apache.org/jira/browse/CALCITE-1859
Project: Calcite
Issue Type: Bug
Components: core
Affects Versions: 1.13.0
Reporter: Jesus Camacho Rodriguez
Assignee: Jesus Camacho Rodriguez
Priority: Critical
Fix For: 1.14.0

CALCITE-1812 introduced the following line in the {{validate}} method of VolcanoPlanner:
https://github.com/apache/calcite/blob/master/core/src/main/java/org/apache/calcite/plan/volcano/VolcanoPlanner.java#L891

{code}
final RelMetadataQuery mq = root.getCluster().getMetadataQuery();
{code}

{{validate}} might be called as part of the {{setRoot}} logic before _root_ is set, so we hit an NPE. The workaround is easy, as {{validate}} is only called at the DEBUG logging level (I guess that is why we did not see this issue before), but this JIRA will fix the issue by retrieving the RelMetadataQuery in {{validate}} only when needed.

--
This message was sent by Atlassian JIRA (v6.4.14#64029)
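The failure mode described in CALCITE-1859 can be illustrated with a small, self-contained sketch. It does not use Calcite's classes: SketchPlanner, SketchNode, SketchCluster and SketchMetadataQuery are hypothetical stand-ins, and the "lazy" branch is only one plausible reading of "retrieving the RelMetadataQuery only when needed", not necessarily the fix that was committed.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-ins for illustration only; these are NOT Calcite classes.
final class SketchMetadataQuery {}

final class SketchCluster {
  private final SketchMetadataQuery mq = new SketchMetadataQuery();
  SketchMetadataQuery getMetadataQuery() { return mq; }
}

final class SketchNode {
  private final SketchCluster cluster = new SketchCluster();
  SketchCluster getCluster() { return cluster; }
}

final class SketchPlanner {
  private final boolean lazy;
  private SketchNode root; // stays null until setRoot has finished
  private final List<SketchNode> registered = new ArrayList<>();
  int checked = 0; // number of nodes validate() has looked at

  SketchPlanner(boolean lazy) { this.lazy = lazy; }

  void setRoot(SketchNode node) {
    registered.add(node);
    validate(); // runs before root is assigned, as in the report
    this.root = node;
  }

  private void validate() {
    if (lazy) {
      // Lazy variant (assumption): fetch the metadata query from the node
      // being checked, so the still-null root is never dereferenced.
      for (SketchNode n : registered) {
        SketchMetadataQuery mq = n.getCluster().getMetadataQuery();
        if (mq != null) {
          checked++;
        }
      }
    } else {
      // Eager variant: dereferences root up front and throws a
      // NullPointerException when validate() runs before root is set.
      SketchMetadataQuery mq = root.getCluster().getMetadataQuery();
      for (SketchNode n : registered) {
        if (mq != null) {
          checked++;
        }
      }
    }
  }
}
```

With these stand-ins, new SketchPlanner(false).setRoot(new SketchNode()) throws a NullPointerException inside validate(), while new SketchPlanner(true).setRoot(new SketchNode()) completes and leaves checked at 1.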
Re: How to merge pull request in github?
If I understand your question correctly, you should probably rebase the PR before merging it into master. Then add a reference to the PR to the commit message, e.g., "Close apache/calcite#xxx", and the PR will appear as merged.

-Jesús

On 6/23/17, 10:22 AM, "zhiqiang" wrote:

>Hi all
>How to merge pull requests in calcite github?
>I saw all pull request was closed. but not merged in github.
>All commits can be found in:
>https://git1-us-west.apache.org/repos/asf?p=calcite.git;a=summary
>but how to commit codes to apache git?
>
>Regards
>Zhiqiang He
[VOTE] Release apache-calcite-1.13.0 (release candidate 0)
Hi all,

I have created a build for Apache Calcite 1.13.0, release candidate 0.

Thanks to everyone who has contributed to this release.

This release comes three months after 1.12.0. It includes more than 75 resolved issues, comprising a large number of new features as well as general improvements and bug-fixes. Among others, it includes the upgrade to Avatica 1.10.0, a new materialized view rewriting algorithm, extensions for streaming queries, improvements for the Druid adapter, and bug-fixes and general improvements for query planning.

You can read the release notes here:
https://github.com/apache/calcite/blob/branch-1.13/site/_docs/history.md

The commit to be voted upon:
http://git-wip-us.apache.org/repos/asf/calcite/commit/54b9823

Its hash is 54b9823e7ca313bf195c19b7d98f1a06b342cf12.

The artifacts to be voted on are located here:
https://dist.apache.org/repos/dist/dev/calcite/apache-calcite-1.13.0-rc0/

The hashes of the artifacts are as follows:

apache-calcite-1.13.0-src.tar.gz: MD5 = 6D 9B 54 C4 6A 02 E7 A3 2C C8 7F 5A BF 72 F3 48
apache-calcite-1.13.0-src.tar.gz: SHA1 = B042 5838 5135 2531 3D43 027F 1D00 DE71 2B95 4A30
apache-calcite-1.13.0-src.tar.gz: RMD160 = F761 E856 8036 59A8 FA7D F439 2AFF 83B9 BF12 A296
apache-calcite-1.13.0-src.tar.gz: SHA224 = D78AE1B2 67B72EED 9ED0A2D0 76F6D863 EF4B2D10 AB168E10 968F39AA
apache-calcite-1.13.0-src.tar.gz: SHA256 = EB7453F4 0D47FB4F 16052DAE 01564A86 A3C6B054 9B0533BF B1283C16 F5374894
apache-calcite-1.13.0-src.tar.gz: SHA384 = F6D2BD28 3F9A0E68 980CB04A FFC73FA0 4B1D41A8 747D885A 4F935C8E 2E1F8CAB CA75B3DD D3F9D03B FBA4A66A 5D956460
apache-calcite-1.13.0-src.tar.gz: SHA512 = A10D9740 7FC5A31A A36CC088 47E867DB 7375E422 5EAF507E 9C20D3FC B1AA9D1F CF904BF9 B5659406 8F6E88EC 82A5C862 EB59C614 F7491C1F D7B7B4E6 E49D7AF6

apache-calcite-1.13.0-src.zip: MD5 = 02 47 F8 08 29 87 DF 37 E8 43 9A 03 71 0D D1 42
apache-calcite-1.13.0-src.zip: SHA1 = 3ED5 2158 58FA 3580 78CA D01F B69C E1E9 A33C 5CC6
apache-calcite-1.13.0-src.zip: RMD160 = 02BA 4EB1 59A0 00D3 4506 A209 4000 CF10 57F2 DAC5
apache-calcite-1.13.0-src.zip: SHA224 = 954AFE86 D4531B3B 5FF7A741 92DAD116 D41B3D5C 3B22C931 A20C9434
apache-calcite-1.13.0-src.zip: SHA256 = 4B79F66C 87F5FACE 51EFACA4 B846FAD9 30B0EBDB E997DE09 26B7A430 A0A57D7A
apache-calcite-1.13.0-src.zip: SHA384 = 028565D5 29E50683 CAB9565F 70B7643A 6B090CD9 EB45EC9D C3E57497 BC0D44AC 04EC0B14 51BB05CE A48A98A2 69B47CE4
apache-calcite-1.13.0-src.zip: SHA512 = 6FB5E301 59C5AA1A 42CF772B B18404DD 7CBCE15D D36669E2 8C39A322 15D14B89 6AB54593 2F2E64D2 5CDE64B2 1C664D40 AE6BC52A 9DE266A5 39D7B00A 8EE28FE4

A staged Maven repository is available for review at:
https://repository.apache.org/content/repositories/orgapachecalcite-1036

Release artifacts are signed with the following key:
https://people.apache.org/keys/committer/jcamacho.asc

Please vote on releasing this package as Apache Calcite 1.13.0.

The vote is open for the next 72 hours and passes if a majority of at least three +1 PMC votes are cast.

[ ] +1 Release this package as Apache Calcite 1.13.0
[ ] 0 I don't feel strongly about it, but I'm okay with the release
[ ] -1 Do not release this package because...

Here is my vote: +1 (binding)

Jesús
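For anyone checking the artifacts before voting, the digests listed in the vote email can be compared programmatically. The following is a minimal sketch, assuming the digests are printed as space-separated upper-case hex groups as shown in the email; the class name DigestCheck and its methods are hypothetical helpers, not part of any Calcite or Apache tooling. It uses only the JDK's java.security.MessageDigest.

```java
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Locale;

// Hypothetical helper for illustration; not part of Calcite's release tooling.
final class DigestCheck {

  // Strips whitespace and lower-cases a printed digest such as
  // "EB7453F4 0D47FB4F ..." so it can be compared with plain hex output.
  static String normalize(String printed) {
    return printed.replaceAll("\\s+", "").toLowerCase(Locale.ROOT);
  }

  // Computes the digest of the data with a JCA algorithm name
  // (for example "MD5", "SHA-1", "SHA-256", "SHA-512") as lower-case hex.
  static String hex(byte[] data, String algorithm) {
    final MessageDigest md;
    try {
      md = MessageDigest.getInstance(algorithm);
    } catch (NoSuchAlgorithmException e) {
      throw new IllegalArgumentException("Unsupported algorithm: " + algorithm, e);
    }
    StringBuilder sb = new StringBuilder();
    for (byte b : md.digest(data)) {
      sb.append(String.format("%02x", b));
    }
    return sb.toString();
  }

  // Returns true when the computed digest equals the printed one.
  static boolean matches(byte[] data, String algorithm, String printed) {
    return hex(data, algorithm).equals(normalize(printed));
  }
}
```

To check a downloaded artifact, read its bytes (for example with java.nio.file.Files.readAllBytes) and call DigestCheck.matches(bytes, "SHA-256", ...) with the SHA256 line for that file from the email. RMD160 is not provided by the default JDK security providers, so that line would need a different tool. Comparing digests only detects corruption; verifying the signature against the release key remains a separate step.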
Re: [ANNOUNCE] New committer: Slim Bouguerra
Congrats Slim, welcome!

-Jesús

On 6/19/17, 2:34 AM, "zhiqiang" wrote:

>Congratulations and welcome!
>
>zhiqiang
>
>On Mon, Jun 19, 2017 at 12:04 AM +0800, "Julian" wrote:
>
>Apache Calcite's Project Management Committee (PMC) has invited
>Slim Bouguerra to become a committer, and we are pleased to
>announce that he has accepted.
>
>Slim is an active and respected member of the Druid community, and has
>done a lot of work to make Calcite's Druid adapter production quality. He is
>also working on integrating Druid into Hive (leveraging the adapter), and it
>has been great to see him helping other contributors, mentoring them and
>reviewing their changes.
>
>Slim, thank you for your contributions, and we look forward to your further
>interactions with the community! If you wish, please feel free to tell us
>more about yourself and what you are working on.
>
>Julian (on behalf of the Apache Calcite PMC)