[jira] [Updated] (DRILL-1936) Throw an error if subquery in the where clause does not return scalar result

Victoria Markman (JIRA) Tue, 06 Jan 2015 11:07:58 -0800

     [ https://issues.apache.org/jira/browse/DRILL-1936?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]


Victoria Markman updated DRILL-1936:
------------------------------------
    Component/s: Query Planning & Optimization
    Description: 
{code}
#Fri Jan 02 21:20:47 EST 2015
git.commit.id.abbrev=b491cdb
{code}

When the result of a subquery is non-scalar (regardless of whether the 
subquery is correlated), we should throw an error either at planning time or 
at runtime, once the cardinality of the result set is known.
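
To make the requested behaviour concrete, here is a minimal sketch of the runtime variant of the check. It is an illustration only and not Drill code: `single_value`, `NonScalarSubqueryError` and `select_where_equals_subquery` are hypothetical names, and the rows in the usage note are made up. It assumes the usual SQL rule for scalar subqueries: zero rows yield NULL, exactly one row yields its value, and more than one row is an error.

```python
class NonScalarSubqueryError(Exception):
    """Hypothetical error type: a scalar subquery produced more than one row."""


_MISSING = object()  # sentinel that marks "no row available"


def single_value(rows):
    """Return the only value in rows, None if rows is empty, else raise."""
    it = iter(rows)
    first = next(it, _MISSING)
    if first is _MISSING:
        return None  # empty subquery result is treated as NULL
    if next(it, _MISSING) is not _MISSING:
        # The cardinality is now known to be > 1, so fail with a clear message.
        raise NonScalarSubqueryError("scalar subquery returned more than one row")
    return first


def select_where_equals_subquery(nations, regions):
    """Model of: select * from nation where n_name =
    (select r_name from region where n_regionkey = r_regionkey)."""
    result = []
    for n in nations:
        matches = (r["r_name"] for r in regions
                   if r["r_regionkey"] == n["n_regionkey"])
        # A NULL (None) subquery result never equals n_name, so the row is dropped.
        if n["n_name"] == single_value(matches):
            result.append(n)
    return result
```

With one region row per key the filter evaluates normally; once two region rows share a key, the call raises `NonScalarSubqueryError` instead of failing with an unrelated planner or schema message.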

Currently, queries either fail to plan:

{code}
0: jdbc:drill:schema=dfs> select * from cp.`tpch/nation.parquet`  where n_name = ( select r_name from cp.`tpch/region.parquet` where n_regionkey = r_regionkey);
Query failed: Query failed: Unexpected exception during fragment initialization: Node [rel#24659:Subset#7.LOGICAL.ANY([]).[]] could not be implemented; planner state:

Root: rel#24659:Subset#7.LOGICAL.ANY([]).[]
Original rel:
AbstractConverter(subset=[rel#24659:Subset#7.LOGICAL.ANY([]).[]], convention=[LOGICAL], DrillDistributionTraitDef=[ANY([])], sort=[[]]): rowcount = 1.7976931348623157E308, cumulative cost = {inf}, id = 24660
  ProjectRel(subset=[rel#24658:Subset#7.NONE.ANY([]).[]], *=[$0]): rowcount = 1.7976931348623157E308, cumulative cost = {1.7976931348623157E308 rows, 1.7976931348623157E308 cpu, 0.0 io, 0.0 network, 0.0 memory}, id = 24657
    FilterRel(subset=[rel#24656:Subset#6.NONE.ANY([]).[]], condition=[=($1, $2)]): rowcount = 2.6965397022934733E307, cumulative cost = {2.6965397022934733E307 rows, 1.7976931348623157E308 cpu, 0.0 io, 0.0 network, 0.0 memory}, id = 24655
      JoinRel(subset=[rel#24654:Subset#5.NONE.ANY([]).[]], condition=[true], joinType=[left]): rowcount = 1.7976931348623157E308, cumulative cost = {1.7976931348623157E308 rows, 0.0 cpu, 0.0 io, 0.0 network, 0.0 memory}, id = 24653
        EnumerableTableAccessRel(subset=[rel#24645:Subset#0.ENUMERABLE.ANY([]).[]], table=[[cp, tpch/nation.parquet]]): rowcount = 100.0, cumulative cost = {100.0 rows, 101.0 cpu, 0.0 io, 0.0 network, 0.0 memory}, id = 24623
        AggregateRel(subset=[rel#24652:Subset#4.NONE.ANY([]).[]], group=[{}], agg#0=[SINGLE_VALUE($0)]): rowcount = 1.7976931348623158E307, cumulative cost = {1.7976931348623158E307 rows, 0.0 cpu, 0.0 io, 0.0 network, 0.0 memory}, id = 24651
          ProjectRel(subset=[rel#24650:Subset#3.NONE.ANY([]).[]], r_name=[$3]): rowcount = 1.7976931348623157E308, cumulative cost = {1.7976931348623157E308 rows, 1.7976931348623157E308 cpu, 0.0 io, 0.0 network, 0.0 memory}, id = 24649
            FilterRel(subset=[rel#24648:Subset#2.NONE.ANY([]).[]], condition=[=($1, $2)]): rowcount = 15.0, cumulative cost = {15.0 rows, 100.0 cpu, 0.0 io, 0.0 network, 0.0 memory}, id = 24647
              EnumerableTableAccessRel(subset=[rel#24646:Subset#1.ENUMERABLE.ANY([]).[]], table=[[cp, tpch/region.parquet]]): rowcount = 100.0, cumulative cost = {100.0 rows, 101.0 cpu, 0.0 io, 0.0 network, 0.0 memory}, id = 24624
{code}

or return strange error messages that are difficult to decipher:

{code}
0: jdbc:drill:schema=dfs> select  a.emp_num,
. . . . . . . . . . . . >         a.emp_name
. . . . . . . . . . . . > from    `emp1.json` as a
. . . . . . . . . . . . > where   a.salary > (
. . . . . . . . . . . . >                 select  b.salary
. . . . . . . . . . . . >                 from    `emp1.json` b
. . . . . . . . . . . . >                 where   b.dept = a.dept)
. . . . . . . . . . . . > order by 1;
Query failed: Query failed: Failure while running fragment., Schema is currently null.  You must call buildSchema(SelectionVectorMode) before this container can return a schema. [ d800ab5d-aa5b-4371-8cb6-819dccca40aa on atsqa4-134.qa.lab:31010 ]
[ d800ab5d-aa5b-4371-8cb6-819dccca40aa on atsqa4-134.qa.lab:31010 ]
Error: exception while executing query: Failure while executing query. (state=,code=0)
{code}




> Throw an error if subquery in the where clause does not return scalar result
> ----------------------------------------------------------------------------
>
>                 Key: DRILL-1936
>                 URL: https://issues.apache.org/jira/browse/DRILL-1936
>             Project: Apache Drill
>          Issue Type: Bug
>          Components: Query Planning & Optimization
>            Reporter: Victoria Markman



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
