Re: Need help: exception when using Elasticsearch adapter

Alessandro Solimando Wed, 29 Sep 2021 23:14:29 -0700

Each adapter "speaks" the language of the supported data source, but only
to the extent of querying it, and for the tables stored in it.


If your input tables are not from ES I have a hard time to even understand
what you are trying to achieve in this way.

Consider also that a plan consists generally of operations that are not
always supported in all data sources, that's why when querying data the
adapter "knows" if a filter/join/etc. can be pushed-down/expressed in the
data source query language, if not they will be executed in the Enumerable
convention.

Can you elaborate a bit more on what you are trying to achieve?

Il Gio 30 Set 2021, 04:13 Justin Huang <yohu...@nvidia.com.invalid> ha
scritto:

> I can convert the Logical RelNode to Enumerable RelNode, and then I added
> another step to convert Enumerable RelNode to Elasticsearch RelNode:
>
> def toEsPhysicalPlan(root):
>     print(f"toEsPhysicalPlan root={root}")
>     planner = root.getCluster().getPlanner()
>     planner.clear()
>     for rule in ElasticsearchRules.RULES:
>         planner.addRule(rule)
>
>     program = Programs.of(RuleSets.ofList(planner.getRules()))
>     traits = root.getTraitSet().replace(ElasticsearchRel.CONVENTION)
> return program.run(planner, root, traits, ImmutableList.of(),
> ImmutableList.of())
>
> Now the program.run failed with below exception, what can be the reason?
> Is there any example that shows how to convert a SQL to Elasticsearch Json?
> BTW, my input schema/table is not of ElasticsearchSchema/Table type.
>
> org.apache.calcite.plan.RelOptPlanner.CannotPlanException:
> org.apache.calcite.plan.RelOptPlanner$CannotPlanException: There are not
> enough rules to produce a node with desired properties:
> convention=ELASTICSEARCH, sort=[].
> Missing conversion is EnumerableProject[convention: ENUMERABLE ->
> ELASTICSEARCH]
> There is 1 empty subset: rel#67:RelSubset#3.ELASTICSEARCH.[], the relevant
> part of the original plan is as follows
> 65:EnumerableProject(BugId=[$0], ARB=[$1])
>   60:EnumerableTableScan(subset=[rel#64:RelSubset#2.ENUMERABLE.[]],
> table=[[root, table_bugx]])
>
> Root: rel#67:RelSubset#3.ELASTICSEARCH.[]
> Original rel:
> EnumerableProject(subset=[rel#58:RelSubset#1.ENUMERABLE.[]], BugId=[$0],
> ARB=[$1]): rowcount = 100.0, cumulative cost = {100.0 rows, 200.0 cpu, 0.0
> io}, id = 62
>   EnumerableTableScan(subset=[rel#61:RelSubset#0.ENUMERABLE.[]],
> table=[[root, table_bugx]]): rowcount = 100.0, cumulative cost = {100.0
> rows, 101.0 cpu, 0.0 io}, id = 60
>
> Sets:
> Set#2, type: RecordType(BIGINT BugId, VARCHAR ARB)
>         rel#64:RelSubset#2.ENUMERABLE.[], best=rel#60
>                 rel#60:EnumerableTableScan.ENUMERABLE.[](table=[root,
> table_bugx]), rowcount=100.0, cumulative cost={100.0 rows, 101.0 cpu, 0.0
> io}
> Set#3, type: RecordType(BIGINT BugId, VARCHAR ARB)
>         rel#66:RelSubset#3.ENUMERABLE.[], best=rel#65
>
> rel#65:EnumerableProject.ENUMERABLE.[](input=RelSubset#64,inputs=0..1),
> rowcount=100.0, cumulative cost={200.0 rows, 301.0 cpu, 0.0 io}
>         rel#67:RelSubset#3.ELASTICSEARCH.[], best=null
>
> rel#68:AbstractConverter.ELASTICSEARCH.[](input=RelSubset#66,convention=ELASTICSEARCH,sort=[]),
> rowcount=100.0, cumulative cost={inf}
>
> Graphviz:
> digraph G {
>         root [style=filled,label="Root"];
>         subgraph cluster2{
>                 label="Set 2 RecordType(BIGINT BugId, VARCHAR ARB)";
>                 rel60 [label="rel#60:EnumerableTableScan\ntable=[root,
> table_bugx]\nrows=100.0, cost={100.0 rows, 101.0 cpu, 0.0
> io}",color=blue,shape=box]
>                 subset64 [label="rel#64:RelSubset#2.ENUMERABLE.[]"]
>         }
>         subgraph cluster3{
>                 label="Set 3 RecordType(BIGINT BugId, VARCHAR ARB)";
>                 rel65
> [label="rel#65:EnumerableProject\ninput=RelSubset#64,inputs=0..1\nrows=100.0,
> cost={200.0 rows, 301.0 cpu, 0.0 io}",color=blue,shape=box]
>                 rel68
> [label="rel#68:AbstractConverter\ninput=RelSubset#66,convention=ELASTICSEARCH,sort=[]\nrows=100.0,
> cost={inf}",shape=box]
>                 subset66 [label="rel#66:RelSubset#3.ENUMERABLE.[]"]
>                 subset67
> [label="rel#67:RelSubset#3.ELASTICSEARCH.[]",color=red]
>         }
>         root -> subset67;
>         subset64 -> rel60[color=blue];
>         subset66 -> rel65[color=blue]; rel65 -> subset64[color=blue];
>         subset67 -> rel68; rel68 -> subset66;
> }
>
> Thanks,
> Jusitn
>
> From: Justin Huang
> Sent: Monday, September 27, 2021 1:56 PM
> To: dev@calcite.apache.org
> Subject: Need help: exception when using Elasticsearch adapter
>
> Hi Calcite developers,
>
> I tried to convert a simple 'SELECT * FROM xxx' statement to Elasticsearch
> DSL using the Elasticsearch adapter.
>
> The toPhysicalPlan(relNode) failed when running program.run. Can anyone
> shed some lights on this? Any requirements on the Table/Schema? The table I
> used in my class is a DaskTable <
> https://github.com/dask-contrib/dask-sql/blob/main/planner/src/main/java/com/dask/sql/schema/DaskTable.java>
> (from dask-sql project) which implements the interface
> ProjectableFilterableTable.
>
> Thanks,
> Justin
>
> I am using Calcite in Python using with Jpype module, and the function
> sql2dsl converts RelNode to Elasticsearch DSL:
>
> def toPhysicalPlan(root):
>     rules = (
>                 EnumerableRules.ENUMERABLE_PROJECT_RULE,
>                 EnumerableRules.ENUMERABLE_FILTER_RULE,
>                 EnumerableRules.ENUMERABLE_AGGREGATE_RULE,
>                 EnumerableRules.ENUMERABLE_SORT_RULE,
>                 EnumerableRules.ENUMERABLE_TABLE_SCAN_RULE
>             )
>
>     planner = root.getCluster().getPlanner()
>     planner.clear()
>     for rule in rules:
>         planner.addRule(rule)
>
>     program = Programs.of(RuleSets.ofList(planner.getRules()))
>     traits = root.getTraitSet().replace(EnumerableConvention.INSTANCE)
>     return program.run(planner, root, traits, ImmutableList.of(),
> ImmutableList.of())
>
> def toElasticsearchQuery(root):
>     relImplementor =
> EnumerableRelImplementor(root.getCluster().getRexBuilder(),
> ImmutableMap.of())
>     elasticsearchImplementor = ElasticsearchRel.Implementor()
>     elasticsearchImplementor.visitChild(0, root)
>     for x in elasticsearchImplementor.list:
>         print(f"x: {x}")
>
> def sql2dsl(relNode):
>     esPhysicalPlan = toPhysicalPlan(relNode)
>     esJson = toElasticsearchQuery(esPhysicalPlan)
>     print(f"esJson={str(esJson)}")
>
> Here are the error messages:
>
> org.apache.calcite.plan.RelOptPlanner.CannotPlanException:
> org.apache.calcite.plan.RelOptPlanner$CannotPlanException: There are not
> enough rules to produce a node with desired properties:
> convention=ENUMERABLE, sort=[].
> Missing conversion is LogicalTableScan[convention: NONE -> ENUMERABLE]
> There is 1 empty subset: rel#51:RelSubset#0.ENUMERABLE.[], the relevant
> part of the original plan is as follows
> 47:LogicalTableScan(table=[[xxx]])
>
> Root: rel#51:RelSubset#0.ENUMERABLE.[]
> Original rel:
> LogicalTableScan(subset=[rel#51:RelSubset#0.ENUMERABLE.[]],
> table=[[xxx]]): rowcount = 100.0, cumulative cost = {100.0 rows, 101.0 cpu,
> 0.0 io}, id = 47
>
> Sets:
> Set#0, type: RecordType(VARCHAR name, BIGINT age, VARCHAR position, BIGINT
> BugId)
>         rel#50:RelSubset#0.NONE.[], best=null
>                 rel#47:LogicalTableScan.NONE.[](table=[xxx]),
> rowcount=100.0, cumulative cost={inf}
>         rel#51:RelSubset#0.ENUMERABLE.[], best=null
>
> rel#52:AbstractConverter.ENUMERABLE.[](input=RelSubset#50,convention=ENUMERABLE,sort=[]),
> rowcount=100.0, cumulative cost={inf}
>
> Graphviz:
> digraph G {
>         root [style=filled,label="Root"];
>         subgraph cluster0{
>                 label="Set 0 RecordType(VARCHAR name, BIGINT age, VARCHAR
> position, BIGINT BugId)";
>                 rel47
> [label="rel#47:LogicalTableScan\ntable=[xxx]\nrows=100.0,
> cost={inf}",shape=box]
>                 rel52
> [label="rel#52:AbstractConverter\ninput=RelSubset#50,convention=ENUMERABLE,sort=[]\nrows=100.0,
> cost={inf}",shape=box]
>                 subset50 [label="rel#50:RelSubset#0.NONE.[]"]
>                 subset51
> [label="rel#51:RelSubset#0.ENUMERABLE.[]",color=red]
>         }
>         root -> subset51;
>         subset50 -> rel47;
>         subset51 -> rel52; rel52 -> subset50;
> }
>

Re: Need help: exception when using Elasticsearch adapter

Reply via email to