Re: Is it possible to retrieve materialized view query rewrite from Calcite Plan
I think it will work. I once did similar things and was playing around with VolcanoPlanner and RelOptMaterializations. Shubham Kumar 于2019年9月16日周一 下午8:32写道: > Hey everyone, > > Thank you for the help. > > @Stamatis, @Danny : Thanks for pointing out that the reltosql api should > be used after volcano planner execution which considers materializations > and lattices, I understand it now. > > @Xing: Yeah, it works perfectly fine, I am able to see simple substitutions > in action done by Calcite Planner! > > Although I guess "*org.apache.calcite.tes*t" API can't be accessed in some > other project. To extract out the materialized query, I was playing around > with Prepare and optimize of Calcite Prepare class[1]. Let me know if I am > moving in the correct direction. > > Since, now I am able to get the rewrite query, I wanted to share the exact > use case which I am trying to implement and get your insights: > > I have explicitly defined materialized views which are stored in data > source and defining it in *root.schema.materializations *of model file > (keeping the "*view*" field empty as its already exists). Now, I will be > completely responsible for maintaining this materialized view table in my > datasource as various datasources don't have built in Materialized View > like Mysql etc. Finally, I aim the rewrite query that I retrieve from > Calcite to execute it using Spark SQL (data of tables will be stored in > HDFS). So basically query rewrite of Calcite will be a layer before Spark > SQL and it uses the optimized query using materialized views for making the > actual Spark plan and executing it in a distributed manner. > > I would love to have your insights on this on whether this is will be > feasible and if it would improve query performance. > > Problems that I am facing currently is that while using " > *org.apache.calcite.adapter.jdbc.JdbcSchema$Factory*" as the factory method > I get the following exception: > > > *Cannot define materialization; parent schema 'test_calcite' is not a > SemiMutableSchema* > > Currently I am using CloneSchema$Factory for testing things out but I guess > it maintains in memory tables which eventually I won't be able to use in > case of large tables. I was wondering if I would have to write my own > Factory classes and implementations for my use case or is there something > already present. > > [1] > > > https://github.com/apache/calcite/blob/73023148e7f37d494f6caf92b01b090f6dde13cd/core/src/main/java/org/apache/calcite/prepare/Prepare.java#L320 > > Thanks > Shubham > > > On Mon, Sep 16, 2019 at 11:32 AM XING JIN wrote: > > > Hi, Shubham Kumar > > If I understand correctly, you want get the optimized(by materialized > view) > > SQL String. I wrote a simple test as below, please check if it's helpful > > for you. > > > > @Test public void testDEV() { > > final String m = "select \"deptno\", \"empid\", \"name\"" > > + "from \"emps\" where \"deptno\" = 10"; > > final String q = "select \"empid\" + 1 as x, \"name\"" > > + "from \"emps\" where \"deptno\" = 10"; > > > > CalciteAssert.that() > > .withMaterializations(HR_FKUK_MODEL, "m0", m) > > .query(q) > > .withHook(Hook.SUB, (Consumer) r -> > > { > > RelToSqlConverter converter = > > new RelToSqlConverter(CalciteSqlDialect.DEFAULT); > > final SqlNode sqlNode = converter.visitChild(0, r).asStatement(); > > > > > System.out.println(sqlNode.toSqlString(CalciteSqlDialect.DEFAULT).getSql()); > > }) > > .enableMaterializations(true) > > .explainContains("hr, m0"); > > } > > > > The output is as below: > > > > SELECT "empid" + 1 AS "X", "name" > > FROM "hr"."m0" > > > > > > Danny Chan 于2019年9月16日周一 下午1:44写道: > > > > > Hi, Shubham Kumar ~ > > > > > > > However, I wanted the optimized rewrite SQL query if possible, not > just > > > the > > > > plan. So far, I tried to use the rel2sql api but when I print > > > > RelNode.toString(), it gives me a plan which involves scanning the > raw > > > > tables and hence the SQL generated by rel2sql is not the one which > > > utilized > > > > the materialized view. Any pointers to get the rewrite query. > > > > > > Did you try the RelNode.toString() after the volcano planner promotion, > > it > > > is not expected to involve scanning the raw tables. > > > > > > Best, > > > Danny Chan > > > 在 2019年9月11日 +0800 PM9:42,dev@calcite.apache.org,写道: > > > > > > > > However, I wanted the optimized rewrite SQL query if possible, not > just > > > the > > > > plan. So far, I tried to use the rel2sql api but when I print > > > > RelNode.toString(), it gives me a plan which involves scanning the > raw > > > > tables and hence the SQL generated by rel2sql is not the one which > > > utilized > > > > the materialized view. Any pointers to get the rewrite query. > > > > > > > > -- > Thanks & Regards > > Shubham Kumar >
Re: Is it possible to retrieve materialized view query rewrite from Calcite Plan
Hey everyone, Thank you for the help. @Stamatis, @Danny : Thanks for pointing out that the reltosql api should be used after volcano planner execution which considers materializations and lattices, I understand it now. @Xing: Yeah, it works perfectly fine, I am able to see simple substitutions in action done by Calcite Planner! Although I guess "*org.apache.calcite.tes*t" API can't be accessed in some other project. To extract out the materialized query, I was playing around with Prepare and optimize of Calcite Prepare class[1]. Let me know if I am moving in the correct direction. Since, now I am able to get the rewrite query, I wanted to share the exact use case which I am trying to implement and get your insights: I have explicitly defined materialized views which are stored in data source and defining it in *root.schema.materializations *of model file (keeping the "*view*" field empty as its already exists). Now, I will be completely responsible for maintaining this materialized view table in my datasource as various datasources don't have built in Materialized View like Mysql etc. Finally, I aim the rewrite query that I retrieve from Calcite to execute it using Spark SQL (data of tables will be stored in HDFS). So basically query rewrite of Calcite will be a layer before Spark SQL and it uses the optimized query using materialized views for making the actual Spark plan and executing it in a distributed manner. I would love to have your insights on this on whether this is will be feasible and if it would improve query performance. Problems that I am facing currently is that while using " *org.apache.calcite.adapter.jdbc.JdbcSchema$Factory*" as the factory method I get the following exception: *Cannot define materialization; parent schema 'test_calcite' is not a SemiMutableSchema* Currently I am using CloneSchema$Factory for testing things out but I guess it maintains in memory tables which eventually I won't be able to use in case of large tables. I was wondering if I would have to write my own Factory classes and implementations for my use case or is there something already present. [1] https://github.com/apache/calcite/blob/73023148e7f37d494f6caf92b01b090f6dde13cd/core/src/main/java/org/apache/calcite/prepare/Prepare.java#L320 Thanks Shubham On Mon, Sep 16, 2019 at 11:32 AM XING JIN wrote: > Hi, Shubham Kumar > If I understand correctly, you want get the optimized(by materialized view) > SQL String. I wrote a simple test as below, please check if it's helpful > for you. > > @Test public void testDEV() { > final String m = "select \"deptno\", \"empid\", \"name\"" > + "from \"emps\" where \"deptno\" = 10"; > final String q = "select \"empid\" + 1 as x, \"name\"" > + "from \"emps\" where \"deptno\" = 10"; > > CalciteAssert.that() > .withMaterializations(HR_FKUK_MODEL, "m0", m) > .query(q) > .withHook(Hook.SUB, (Consumer) r -> > { > RelToSqlConverter converter = > new RelToSqlConverter(CalciteSqlDialect.DEFAULT); > final SqlNode sqlNode = converter.visitChild(0, r).asStatement(); > > System.out.println(sqlNode.toSqlString(CalciteSqlDialect.DEFAULT).getSql()); > }) > .enableMaterializations(true) > .explainContains("hr, m0"); > } > > The output is as below: > > SELECT "empid" + 1 AS "X", "name" > FROM "hr"."m0" > > > Danny Chan 于2019年9月16日周一 下午1:44写道: > > > Hi, Shubham Kumar ~ > > > > > However, I wanted the optimized rewrite SQL query if possible, not just > > the > > > plan. So far, I tried to use the rel2sql api but when I print > > > RelNode.toString(), it gives me a plan which involves scanning the raw > > > tables and hence the SQL generated by rel2sql is not the one which > > utilized > > > the materialized view. Any pointers to get the rewrite query. > > > > Did you try the RelNode.toString() after the volcano planner promotion, > it > > is not expected to involve scanning the raw tables. > > > > Best, > > Danny Chan > > 在 2019年9月11日 +0800 PM9:42,dev@calcite.apache.org,写道: > > > > > > However, I wanted the optimized rewrite SQL query if possible, not just > > the > > > plan. So far, I tried to use the rel2sql api but when I print > > > RelNode.toString(), it gives me a plan which involves scanning the raw > > > tables and hence the SQL generated by rel2sql is not the one which > > utilized > > > the materialized view. Any pointers to get the rewrite query. > > > -- Thanks & Regards Shubham Kumar
Re: Is it possible to retrieve materialized view query rewrite from Calcite Plan
Hi, Shubham Kumar If I understand correctly, you want get the optimized(by materialized view) SQL String. I wrote a simple test as below, please check if it's helpful for you. @Test public void testDEV() { final String m = "select \"deptno\", \"empid\", \"name\"" + "from \"emps\" where \"deptno\" = 10"; final String q = "select \"empid\" + 1 as x, \"name\"" + "from \"emps\" where \"deptno\" = 10"; CalciteAssert.that() .withMaterializations(HR_FKUK_MODEL, "m0", m) .query(q) .withHook(Hook.SUB, (Consumer) r -> { RelToSqlConverter converter = new RelToSqlConverter(CalciteSqlDialect.DEFAULT); final SqlNode sqlNode = converter.visitChild(0, r).asStatement(); System.out.println(sqlNode.toSqlString(CalciteSqlDialect.DEFAULT).getSql()); }) .enableMaterializations(true) .explainContains("hr, m0"); } The output is as below: SELECT "empid" + 1 AS "X", "name" FROM "hr"."m0" Danny Chan 于2019年9月16日周一 下午1:44写道: > Hi, Shubham Kumar ~ > > > However, I wanted the optimized rewrite SQL query if possible, not just > the > > plan. So far, I tried to use the rel2sql api but when I print > > RelNode.toString(), it gives me a plan which involves scanning the raw > > tables and hence the SQL generated by rel2sql is not the one which > utilized > > the materialized view. Any pointers to get the rewrite query. > > Did you try the RelNode.toString() after the volcano planner promotion, it > is not expected to involve scanning the raw tables. > > Best, > Danny Chan > 在 2019年9月11日 +0800 PM9:42,dev@calcite.apache.org,写道: > > > > However, I wanted the optimized rewrite SQL query if possible, not just > the > > plan. So far, I tried to use the rel2sql api but when I print > > RelNode.toString(), it gives me a plan which involves scanning the raw > > tables and hence the SQL generated by rel2sql is not the one which > utilized > > the materialized view. Any pointers to get the rewrite query. >
Re: Is it possible to retrieve materialized view query rewrite from Calcite Plan
Hi, Shubham Kumar ~ > However, I wanted the optimized rewrite SQL query if possible, not just the > plan. So far, I tried to use the rel2sql api but when I print > RelNode.toString(), it gives me a plan which involves scanning the raw > tables and hence the SQL generated by rel2sql is not the one which utilized > the materialized view. Any pointers to get the rewrite query. Did you try the RelNode.toString() after the volcano planner promotion, it is not expected to involve scanning the raw tables. Best, Danny Chan 在 2019年9月11日 +0800 PM9:42,dev@calcite.apache.org,写道: > > However, I wanted the optimized rewrite SQL query if possible, not just the > plan. So far, I tried to use the rel2sql api but when I print > RelNode.toString(), it gives me a plan which involves scanning the raw > tables and hence the SQL generated by rel2sql is not the one which utilized > the materialized view. Any pointers to get the rewrite query.
Re: Is it possible to retrieve materialized view query rewrite from Calcite Plan
Hi Shubham, View based rewriting is performed during planning (check the call hierarchy of [1] for more details). If you obtain a plan (RelNode) after the planner then most likely rel2sql API should give you the right SQL string. On the other hand if you are just using sql2rel API then rewritings are not considered. Best, Stamatis [1] https://github.com/apache/calcite/blob/d3c718328d4c83fb24007c8349b31b420187/core/src/main/java/org/apache/calcite/plan/RelOptPlanner.java#L177 On Wed, Sep 11, 2019 at 3:42 PM Shubham Kumar wrote: > Hi Contributors, > > I have recently started to experiment with Apache Calcite's materialized > views and I needed a bit of help. > > I have explicitly defined materialized views which are stored in data > source by defining it in root.schema.materializations of model file. > > Now for a query which should use the Materialized View: > > Explain plan for "query" gives a plan which shows that the MV is being > utilized. > > However, I wanted the optimized rewrite SQL query if possible, not just the > plan. So far, I tried to use the rel2sql api but when I print > RelNode.toString(), it gives me a plan which involves scanning the raw > tables and hence the SQL generated by rel2sql is not the one which utilized > the materialized view. Any pointers to get the rewrite query. > > -- > Thanks & Regards > > Shubham Kumar >
Is it possible to retrieve materialized view query rewrite from Calcite Plan
Hi Contributors, I have recently started to experiment with Apache Calcite's materialized views and I needed a bit of help. I have explicitly defined materialized views which are stored in data source by defining it in root.schema.materializations of model file. Now for a query which should use the Materialized View: Explain plan for "query" gives a plan which shows that the MV is being utilized. However, I wanted the optimized rewrite SQL query if possible, not just the plan. So far, I tried to use the rel2sql api but when I print RelNode.toString(), it gives me a plan which involves scanning the raw tables and hence the SQL generated by rel2sql is not the one which utilized the materialized view. Any pointers to get the rewrite query. -- Thanks & Regards Shubham Kumar