[ https://issues.apache.org/jira/browse/DRILL-2602?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Deneche A. Hakim updated DRILL-2602:
------------------------------------
    Attachment: DRILL-2602.4.patch.txt

updated StreamingAggBatch to throw a proper user exception; all unit tests are passing along with customer/tpch

> Throw an error on schema change during streaming aggregation
> ------------------------------------------------------------
>
>                 Key: DRILL-2602
>                 URL: https://issues.apache.org/jira/browse/DRILL-2602
>             Project: Apache Drill
>          Issue Type: Improvement
>          Components: Execution - Relational Operators
>    Affects Versions: 0.8.0
>            Reporter: Victoria Markman
>            Assignee: Deneche A. Hakim
>             Fix For: 1.0.0
>
>         Attachments: DRILL-2602.1.patch.txt, DRILL-2602.2.patch.txt, DRILL-2602.3.patch.txt, DRILL-2602.4.patch.txt, optional.parquet, required.parquet
>
>
> We don't recognize a schema change during streaming aggregation when a column is a mix of required and optional types.
> Hash aggregation does throw a correct error message.
> I have a table 'mix' where:
> {code}
> [Fri Mar 27 09:46:07 root@/mapr/vmarkman.cluster.com/drill/testdata/joins/mix ] # ls -ltr
> total 753
> -rwxr-xr-x 1 root root 759879 Mar 27 09:41 optional.parquet
> -rwxr-xr-x 1 root root   9867 Mar 27 09:41 required.parquet
> [Fri Mar 27 09:46:09 root@/mapr/vmarkman.cluster.com/drill/testdata/joins/mix ] # ~/parquet-tools-1.5.1-SNAPSHOT/parquet-schema optional.parquet
> message root {
>   optional binary c_varchar (UTF8);
>   optional int32 c_integer;
>   optional int64 c_bigint;
>   optional float c_float;
>   optional double c_double;
>   optional int32 c_date (DATE);
>   optional int32 c_time (TIME);
>   optional int64 c_timestamp (TIMESTAMP);
>   optional boolean c_boolean;
>   optional double d9;
>   optional double d18;
>   optional double d28;
>   optional double d38;
> }
> [Fri Mar 27 09:46:41 root@/mapr/vmarkman.cluster.com/drill/testdata/joins/mix ] # ~/parquet-tools-1.5.1-SNAPSHOT/parquet-schema required.parquet
> message root {
>   required binary c_varchar (UTF8);
>   required int32 c_integer;
>   required int64 c_bigint;
>   required float c_float;
>   required double c_double;
>   required int32 c_date (DATE);
>   required int32 c_time (TIME);
>   required int64 c_timestamp (TIMESTAMP);
>   required boolean c_boolean;
>   required double d9;
>   required double d18;
>   required double d28;
>   required double d38;
> }
> {code}
> Nice error message on hash aggregation:
> {code}
> 0: jdbc:drill:schema=dfs> select count(*) from mix group by c_integer;
> +------------+
> |   EXPR$0   |
> +------------+
> Query failed: Query stopped., Hash aggregate does not support schema changes [ 2bc255ce-c7f9-47bf-80b0-a5c87cfa67be on atsqa4-134.qa.lab:31010 ]
> java.lang.RuntimeException: java.sql.SQLException: Failure while executing query.
> 	at sqlline.SqlLine$IncrementalRows.hasNext(SqlLine.java:2514)
> 	at sqlline.SqlLine$TableOutputFormat.print(SqlLine.java:2148)
> 	at sqlline.SqlLine.print(SqlLine.java:1809)
> 	at sqlline.SqlLine$Commands.execute(SqlLine.java:3766)
> 	at sqlline.SqlLine$Commands.sql(SqlLine.java:3663)
> 	at sqlline.SqlLine.dispatch(SqlLine.java:889)
> 	at sqlline.SqlLine.begin(SqlLine.java:763)
> 	at sqlline.SqlLine.start(SqlLine.java:498)
> 	at sqlline.SqlLine.main(SqlLine.java:460)
> {code}
> On streaming aggregation, an exception that is hard for the end user to understand:
> {code}
> 0: jdbc:drill:schema=dfs> alter session set `planner.enable_hashagg` = false;
> +------------+------------+
> |     ok     |  summary   |
> +------------+------------+
> | true       | planner.enable_hashagg updated. |
> +------------+------------+
> 1 row selected (0.067 seconds)
> 0: jdbc:drill:schema=dfs> select count(*) from mix group by c_integer;
> +------------+
> |   EXPR$0   |
> +------------+
> Query failed: RemoteRpcException: Failure while running fragment., Failure while reading vector. Expected vector class of org.apache.drill.exec.vector.IntVector but was holding vector class org.apache.drill.exec.vector.NullableIntVector. [ 5610e589-38e0-4dc5-a560-649516180ba4 on atsqa4-134.qa.lab:31010 ]
> java.lang.RuntimeException: java.sql.SQLException: Failure while executing query.
> 	at sqlline.SqlLine$IncrementalRows.hasNext(SqlLine.java:2514)
> 	at sqlline.SqlLine$TableOutputFormat.print(SqlLine.java:2148)
> 	at sqlline.SqlLine.print(SqlLine.java:1809)
> 	at sqlline.SqlLine$Commands.execute(SqlLine.java:3766)
> 	at sqlline.SqlLine$Commands.sql(SqlLine.java:3663)
> 	at sqlline.SqlLine.dispatch(SqlLine.java:889)
> 	at sqlline.SqlLine.begin(SqlLine.java:763)
> 	at sqlline.SqlLine.start(SqlLine.java:498)
> 	at sqlline.SqlLine.main(SqlLine.java:460)
> {code}

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
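The shape of the fix described above (StreamingAggBatch detecting a schema change between batches and raising a readable user error instead of an internal vector-class mismatch) can be sketched in a self-contained way. This is a hypothetical illustration, not Drill's actual API: the class name, the "name:MODE" field encoding, and the use of UnsupportedOperationException in place of Drill's UserException are all assumptions for the sake of a runnable example.

```java
import java.util.List;

// Hypothetical sketch: streaming aggregation remembers the schema of the
// first incoming batch and must fail with a clear message when a later
// batch disagrees (e.g. a column flips between REQUIRED and OPTIONAL).
public class SchemaChangeSketch {

    // Each field is described as "name:MODE", e.g. "c_integer:REQUIRED".
    static boolean isSchemaChange(List<String> expected, List<String> incoming) {
        return !expected.equals(incoming);
    }

    static void checkBatchSchema(List<String> expected, List<String> incoming) {
        if (isSchemaChange(expected, incoming)) {
            // The real patch would raise a Drill user-facing exception here,
            // mirroring the hash aggregate's behavior, so the user sees a
            // readable message rather than a vector-class cast error.
            throw new UnsupportedOperationException(
                "Streaming aggregate does not support schema changes: expected "
                    + expected + " but received " + incoming);
        }
    }

    public static void main(String[] args) {
        List<String> firstBatch = List.of("c_integer:REQUIRED");
        List<String> nextBatch  = List.of("c_integer:OPTIONAL");
        try {
            checkBatchSchema(firstBatch, nextBatch);
        } catch (UnsupportedOperationException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

The point of the check is only where the error is raised: failing at schema comparison time produces a message the end user can act on, instead of the downstream IntVector/NullableIntVector mismatch shown in the report.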