Stefán Baxter created DRILL-3563:
------------------------------------

             Summary: Type confusion and number formatting exceptions
                 Key: DRILL-3563
                 URL: https://issues.apache.org/jira/browse/DRILL-3563
             Project: Apache Drill
          Issue Type: Bug
          Components: Query Planning & Optimization
    Affects Versions: 1.1.0
            Reporter: Stefán Baxter
            Assignee: Jinfeng Ni

It seems that null values can cause a column to be treated as numeric during expression evaluation, regardless of its content or other indicators, and that fields in substructures can affect same-named fields in the parent structure. (1.2-SNAPSHOT, parquet files)

I have JSON data that can be reduced to this:

{"occurred_at":"2015-07-26 08:45:41.234","type":"plan.item.added","dimensions":{"type":null,"dim_type":"Unspecified","category":"Unspecified","sub_category":null}}
{"occurred_at":"2015-07-26 08:45:43.598","type":"plan.item.removed","dimensions":{"type":"Unspecified","dim_type":null,"category":"Unspecified","sub_category":null}}
{"occurred_at":"2015-07-26 08:45:44.241","type":"plan.item.removed","dimensions":{"type":"To See","category":"Nature","sub_category":"Waterfalls"}}

* Note the discrepancy in the dimensions structure: the type field is called either type or dim_type (slightly relevant to the rest of this case).

1. Query where dimensions are not involved

select p.type, count(*)
from dfs.tmp.`/analytics/processed/<some-tenant>/events` as p
where occurred_at > '2015-07-26'
  and p.type in ('plan.item.added','plan.item.removed')
group by p.type;

+--------------------+---------+
| type               | EXPR$1  |
+--------------------+---------+
| plan.item.removed  | 947     |
| plan.item.added    | 40342   |
+--------------------+---------+
2 rows selected (0.508 seconds)

2. Same query, but involving dimensions.type as well

select p.type, coalesce(p.dimensions.dim_type, p.dimensions.type) dimensions_type, count(*)
from dfs.tmp.`/analytics/processed/<some-tenant>/events` as p
where occurred_at > '2015-07-26'
  and p.type in ('plan.item.added','plan.item.removed')
group by p.type, coalesce(p.dimensions.dim_type, p.dimensions.type);

Error: SYSTEM ERROR: NumberFormatException: To See

Fragment 2:0

[Error Id: 4756f549-cc47-43e5-899e-10a11efb60ea on localhost:31010] (state=,code=0)

I can provide test data if this is not enough to reproduce this bug.

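As a starting point for reproduction, the following is a minimal Python sketch that writes the three sample records above to a JSON-lines file and reports which fields mix null and string values. It also computes, in plain Python, what coalesce(dim_type, type) would be expected to return for each record. The file name drill3563_sample.json and all helper names are hypothetical, and the sketch does not cover converting the data to parquet, which is the format the failing query actually ran against.

```python
import json
import os
import tempfile
from collections import defaultdict

# The three sample records from this report, verbatim.
RECORDS = [
    '{"occurred_at":"2015-07-26 08:45:41.234","type":"plan.item.added","dimensions":{"type":null,"dim_type":"Unspecified","category":"Unspecified","sub_category":null}}',
    '{"occurred_at":"2015-07-26 08:45:43.598","type":"plan.item.removed","dimensions":{"type":"Unspecified","dim_type":null,"category":"Unspecified","sub_category":null}}',
    '{"occurred_at":"2015-07-26 08:45:44.241","type":"plan.item.removed","dimensions":{"type":"To See","category":"Nature","sub_category":"Waterfalls"}}',
]


def write_sample(path):
    """Write the sample records to `path`, one JSON object per line."""
    with open(path, "w", encoding="utf-8") as fh:
        for line in RECORDS:
            fh.write(line + "\n")


def field_types(lines):
    """Map each dotted field path to the set of Python type names seen."""
    seen = defaultdict(set)

    def walk(prefix, obj):
        for key, value in obj.items():
            path = f"{prefix}.{key}" if prefix else key
            if isinstance(value, dict):
                walk(path, value)
            else:
                seen[path].add(type(value).__name__)

    for line in lines:
        walk("", json.loads(line))
    return dict(seen)


def first_non_null(*values):
    """Plain-Python analogue of SQL COALESCE: first value that is not None."""
    return next((v for v in values if v is not None), None)


def expected_dimension_types(lines):
    """Expected coalesce(dimensions.dim_type, dimensions.type) per record."""
    result = []
    for line in lines:
        dims = json.loads(line)["dimensions"]
        result.append(first_non_null(dims.get("dim_type"), dims.get("type")))
    return result


if __name__ == "__main__":
    # Hypothetical output location; adjust to wherever the data should live.
    sample_path = os.path.join(tempfile.gettempdir(), "drill3563_sample.json")
    write_sample(sample_path)
    for path, types in sorted(field_types(RECORDS).items()):
        print(path, sorted(types))
    print(expected_dimension_types(RECORDS))
```

Every value involved is either a string or null, so nothing in the data itself suggests a numeric column; the top-level type field is always a string, while dimensions.type and dimensions.dim_type are the fields that mix nulls and strings.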
-- This message was sent by Atlassian JIRA (v6.3.4#6332)