Re: Materialized View Planner; How to rewrite query to use materialized view

Justin So Mon, 28 Oct 2024 23:53:48 -0700

Thank you so much for your response @suibianwanwan33! Sorry I can't seem to
figure out how to reply to these Apache emails since I keep getting error
400 messages when I click "Reply with your own email client" on the web
interface.


Hi, I didn't look at the code specifically, as far as I know, there are two
> types of materialized rewrites for calcite, one based on
> MaterializedViewRules and one based on SubstitutionRule rewrites. There are
> some differences in the implementation and rewriting capabilities of these
> two rewrites. If you don't add any rules about materialized rewriting but
> implement rewriting, it's probably SubstitutionRule implementation. You can
> read the SubstitutionVisitor code for this part. I hope it helps you.


I looked at VolcanoPlanner class and found registerMaterializations() is
using the SubstitutionVisitor code and is the reason why VolcanoPlanner can
use materialized views and HepPlanner is unable to since HepPlanner does
not have a similar implementation of registerMaterializations().

More specifically on why HepPlanner is unable to use Materialized Views
without rewriting the internals, VolcanoPlanner uses
SubstitutionVisitor.DEFAULT_RULES to rewrite the query to utilize the
materialized view and I am unable to add SubstitutionVisitor.DEFAULT_RULES
to the HepPlanner via HepProgramBuilder.addRuleInstance().

I am attempting to use MaterializedViewRules in my code, however, I am
unable to get the rules to work. I read over the code in
MaterializedViewRelOptRulesTest to utilize a materialized view, but am
still failing to do so after adding in the MaterializedViewsRules rules in
the above code. Can anyone point me in the right direction to make the
planner utilize any of the rules in MaterializedViewRules?

Thanks,
Justin So



On Wed, Oct 23, 2024 at 2:04 AM Justin So <[email protected]> wrote:

> Hi all,
>
> Update on my progress with some issues:
>
> After copying the code from
> https://www.querifylabs.com/blog/assembling-a-query-optimizer-with-apache-calcite
>  to
> setup the necessary tables, schema and optimizer, I was able to rewrite a
> query to utilize an MV with the following code:
>
>         SimpleTable lineitem = SimpleTable.newBuilder("lineitem")
>                 .addField("l_quantity", SqlTypeName.DECIMAL)
>                 .addField("l_extendedprice", SqlTypeName.DECIMAL)
>                 .addField("l_discount", SqlTypeName.DECIMAL)
>                 .addField("l_shipdate", SqlTypeName.DATE)
>                 .addField("l_orderkey", SqlTypeName.INTEGER)
>                 .addField("l_tax", SqlTypeName.DECIMAL)
>                 .withRowCount(60_000L)
>                 .build();
>
>         SimpleTable mv = SimpleTable.newBuilder("mv0")
>                 .addField("l_discount", SqlTypeName.DECIMAL)
>                 .addField("l_tax", SqlTypeName.DECIMAL)
>                 .withRowCount(3_000L)
>                 .build();
>
>         SimpleSchema schema = 
> SimpleSchema.newBuilder("tpch").addTable(lineitem).addTable(mv).build();
>
>         Optimizer optimizer = Optimizer.create(schema);
>
>         String sql = "select l_discount, sum(l_tax)\n" +
>                 "from lineitem\n" +
>                 "where l_discount > 100 group by l_discount";
>         SqlNode sqlTree = optimizer.parse(sql);
>         SqlNode validatedSqlTree = optimizer.validate(sqlTree);
>         RelNode queryRel = optimizer.convert(validatedSqlTree);
>
>         String mvSQL = "select * from mv0";
>         SqlNode mvsqlTree = optimizer.parse(mvSQL);
>         SqlNode mvvalidatedSqlTree = optimizer.validate(mvsqlTree);
>         RelNode mvtableRel = optimizer.convert(mvvalidatedSqlTree);
>
>         System.out.println(queryRel.explain());
>         /*
>         LogicalAggregate(group=[{0}], EXPR$1=[SUM($1)])
>             LogicalProject(l_discount=[$2], l_tax=[$5])
>                 LogicalFilter(condition=[>($2, 100)])
>                     LogicalTableScan(table=[[tpch, lineitem]])
>          */
>         RelOptMaterialization mat1 =
>                 new RelOptMaterialization(mvtableRel, queryRel, null,
>                         ImmutableList.of("default", "mv"));
>         optimizer.planner.addMaterialization(mat1);
>
>         RuleSet rules = RuleSets.ofList(
>                 CoreRules.FILTER_TO_CALC,
>                 CoreRules.PROJECT_TO_CALC,
>                 CoreRules.FILTER_CALC_MERGE,
>                 CoreRules.PROJECT_CALC_MERGE,
>                 EnumerableRules.ENUMERABLE_TABLE_SCAN_RULE,
>                 EnumerableRules.ENUMERABLE_PROJECT_RULE,
>                 EnumerableRules.ENUMERABLE_FILTER_RULE,
>                 EnumerableRules.ENUMERABLE_CALC_RULE,
>                 EnumerableRules.ENUMERABLE_AGGREGATE_RULE
>         );
>
>         Program program = Programs.of(RuleSets.ofList(rules));
>
>         RelNode optimizerRelTree = program.run(
>                 optimizer.planner,
>                 queryRel,
>                 queryRel.getTraitSet().plus(EnumerableConvention.INSTANCE),
>                 List.of(mat1),
>                 Collections.emptyList()
>         );
>         System.out.println(optimizerRelTree.explain());
>         /*
>         EnumerableProject(l_discount=[$0], EXPR$1=[$1])
>             EnumerableTableScan(table=[[tpch, mv0]])
>          */
>
>
> These are my current issues:
> 1. My query is being rewritten to use a materialized view, however, I did
> not set any rules from MaterializedViewsRules. How do I make my code use
> MaterializedViewsRules?
> 2. The current optimizer uses the VolcanoPlanner, and when I change it to
> use HepPlanner, I get an error relating to null root, specifically
> java.lang.NullPointerException: root
>
> at java.base/java.util.Objects.requireNonNull(Objects.java:246)
> at org.apache.calcite.plan.hep.HepPlanner.changeTraits(HepPlanner.java:182)
> at org.apache.calcite.tools.Programs$RuleSetProgram.run(Programs.java:325)
> at test.OptimizerTest.test_tpch_q6_materialization(OptimizerTest.java:158)
> at java.base/java.lang.reflect.Method.invoke(Method.java:580)
> at java.base/java.util.ArrayList.forEach(ArrayList.java:1597)
> at java.base/java.util.ArrayList.forEach(ArrayList.java:1597)
>
> Not sure why this is happening because in Programs.run() it sets the root
> in the planner.
>
> Thanks,
> Justin So
>
> On Mon, Oct 21, 2024 at 8:00 AM Justin So <[email protected]> wrote:
>
>> Hi all,
>> Hopefully this is the right place to direct my questions. If not could
>> someone direct me towards the right place.
>>
>> I am currently exploring creating a Materialized View planner rule and am
>> running into issues on how to create a materialized view in Calcite and
>> utilize the current materialized view planner rules for a query workload.
>> The Apache Calcite docs
>> <https://calcite.apache.org/docs/materialized_views.html> only show what
>> it can do, but not the actual code on how to get started with using
>> materialized views or creating materialized views. Specifically I want to
>> integrate the materialized view rules under
>> https://calcite.apache.org/docs/materialized_views.html#rewriting-using-plan-structural-information
>>  into
>> a planner (e.g Volcano or HepPlanner). I tried looking into the codebase,
>> but it is so large I am not sure where to start looking, especially with
>> the many Materialized[a-z]+ classes. Could someone point me to any code
>> examples that utilize materialized views or at least in the right direction?
>>
>> Is there a way to automatically discover materialized view candidates
>> that is not through creating a model.json file and modifying it like in
>> https://calcite.apache.org/docs/lattice.html.
>>
>> Thanks,
>> Justin So
>>
>>
>>
>>
>>

Re: Materialized View Planner; How to rewrite query to use materialized view

Reply via email to