[jira] [Commented] (DRILL-4459) SchemaChangeException while querying hive json table
[ https://issues.apache.org/jira/browse/DRILL-4459?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15250231#comment-15250231 ]

Sudheesh Katkam commented on DRILL-4459:

Fixed in [3b056db|https://github.com/apache/drill/commit/3b056db0f504d50fe11a6028b1a633ec74d478d2].

> SchemaChangeException while querying hive json table
>
> Key: DRILL-4459
> URL: https://issues.apache.org/jira/browse/DRILL-4459
> Project: Apache Drill
> Issue Type: Bug
> Components: Functions - Drill, Functions - Hive
> Affects Versions: 1.4.0
> Environment: MapR-Drill 1.4.0
>              Hive-1.2.0
> Reporter: Vitalii Diravka
> Assignee: Vitalii Diravka
> Fix For: 1.7.0
>
> getting the SchemaChangeException while querying json documents stored in
> hive table.
> {noformat}
> Error: SYSTEM ERROR: SchemaChangeException: Failure while trying to
> materialize incoming schema. Errors:
>
> Error in expression at index -1. Error: Missing function implementation:
> [castBIT(VAR16CHAR-OPTIONAL)]. Full expression: --UNKNOWN EXPRESSION--..
> {noformat}
> minimum reproduce
> {noformat}
> created sample json documents using the attached script(randomdata.sh)
> hive>create table simplejson(json string);
> hive>load data local inpath '/tmp/simple.json' into table simplejson;
> now query it through Drill.
> Drill Version
> select * from sys.version;
> +-----------+----------------+-------------+-------------+------------+
> | commit_id | commit_message | commit_time | build_email | build_time |
> +-----------+----------------+-------------+-------------+------------+
> | eafe0a245a0d4c0234bfbead10c6b2d7c8ef413d | DRILL-3901: Don't do early expansion of directory in the non-metadata-cache case because it already happens during ParquetGroupScan's metadata gathering operation. | 07.10.2015 @ 17:12:57 UTC | Unknown | 07.10.2015 @ 17:36:16 UTC |
> +-----------+----------------+-------------+-------------+------------+
> 0: jdbc:drill:zk=> select * from hive.`default`.simplejson where GET_JSON_OBJECT(simplejson.json, '$.DocId') = 'DocId2759947' limit 1;
> Error: SYSTEM ERROR: SchemaChangeException: Failure while trying to
> materialize incoming schema. Errors:
>
> Error in expression at index -1. Error: Missing function implementation:
> [castBIT(VAR16CHAR-OPTIONAL)]. Full expression: --UNKNOWN EXPRESSION--..
> Fragment 1:1
> [Error Id: 74f054a8-6f1d-4ddd-9064-3939fcc82647 on ip-10-0-0-233:31010]
> (state=,code=0)
> {noformat}

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
[jira] [Commented] (DRILL-4459) SchemaChangeException while querying hive json table
[ https://issues.apache.org/jira/browse/DRILL-4459?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15249284#comment-15249284 ]

ASF GitHub Bot commented on DRILL-4459:
---

Github user asfgit closed the pull request at:

    https://github.com/apache/drill/pull/431
[jira] [Commented] (DRILL-4459) SchemaChangeException while querying hive json table
[ https://issues.apache.org/jira/browse/DRILL-4459?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15246050#comment-15246050 ]

ASF GitHub Bot commented on DRILL-4459:
---

Github user vkorukanti commented on the pull request:

    https://github.com/apache/drill/pull/431#issuecomment-211483421

    Updated PR LGTM, +1.
[jira] [Commented] (DRILL-4459) SchemaChangeException while querying hive json table
[ https://issues.apache.org/jira/browse/DRILL-4459?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15231281#comment-15231281 ]

ASF GitHub Bot commented on DRILL-4459:
---

Github user vdiravka commented on the pull request:

    https://github.com/apache/drill/pull/431#issuecomment-207126641

    @vkorukanti I rebased onto master and updated the commit message after your comment. Please take a quick look at these changes.
[jira] [Commented] (DRILL-4459) SchemaChangeException while querying hive json table
[ https://issues.apache.org/jira/browse/DRILL-4459?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15221982#comment-15221982 ]

ASF GitHub Bot commented on DRILL-4459:
---

Github user vkorukanti commented on the pull request:

    https://github.com/apache/drill/pull/431#issuecomment-204471963

    LGTM, +1.
[jira] [Commented] (DRILL-4459) SchemaChangeException while querying hive json table
[ https://issues.apache.org/jira/browse/DRILL-4459?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15221949#comment-15221949 ]

ASF GitHub Bot commented on DRILL-4459:
---

Github user vdiravka commented on the pull request:

    https://github.com/apache/drill/pull/431#issuecomment-204466782

    @vkorukanti Could you please review this pull request?
[jira] [Commented] (DRILL-4459) SchemaChangeException while querying hive json table
[ https://issues.apache.org/jira/browse/DRILL-4459?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15208205#comment-15208205 ]

ASF GitHub Bot commented on DRILL-4459:
---

Github user vdiravka commented on the pull request:

    https://github.com/apache/drill/pull/431#issuecomment-200290111

    Thanks, Venki. You are right: not all Drill functions accept VAR16CHAR as input, for example SIMILAR TO (LIKE works as a Hive UDF). That's why I've implemented the approach you suggested. Hive STRING is now converted to the Drill VARCHAR data type. I also edited two tests, testGenericUDF and testUDF.
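The change described in the comment above (Hive STRING now maps to Drill VARCHAR) can be pictured with a small sketch. The enums and the `returnTypeFor` method below are hypothetical stand-ins written for illustration; they are not the classes touched by the pull request, and the INT and BOOLEAN rows are assumptions added only to make the mapping look like a table.

```java
class HiveToDrillTypeSketch {
    // Hypothetical stand-ins for Hive primitive categories and Drill minor types.
    enum HiveType { STRING, VARCHAR, INT, BOOLEAN }
    enum DrillType { VARCHAR, VAR16CHAR, INT, BIT }

    // Maps the declared return type of a Hive UDF to the Drill type exposed to the planner.
    static DrillType returnTypeFor(HiveType hiveType) {
        switch (hiveType) {
            case STRING:   // per the thread: expose as VARCHAR rather than VAR16CHAR
            case VARCHAR:
                return DrillType.VARCHAR;
            case INT:      // assumption, for illustration only
                return DrillType.INT;
            case BOOLEAN:  // assumption, for illustration only
                return DrillType.BIT;
            default:
                throw new IllegalArgumentException("unmapped Hive type: " + hiveType);
        }
    }

    public static void main(String[] args) {
        System.out.println(returnTypeFor(HiveType.STRING));
    }
}
```

With a mapping of this shape, downstream Drill functions that only have VARCHAR implementations never see a VAR16CHAR value coming out of a Hive UDF.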
[jira] [Commented] (DRILL-4459) SchemaChangeException while querying hive json table
[ https://issues.apache.org/jira/browse/DRILL-4459?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15201360#comment-15201360 ]

ASF GitHub Bot commented on DRILL-4459:
---

Github user vdiravka commented on a diff in the pull request:

    https://github.com/apache/drill/pull/431#discussion_r56645171

    --- Diff: contrib/storage-hive/core/src/test/java/org/apache/drill/exec/fn/hive/TestInbuiltHiveUDFs.java ---
    @@ -43,4 +47,17 @@ public void testEncode() throws Exception {
             .baselineValues(new Object[] { null })
             .go();
       }
    +
    +  @Test // DRILL-4459
    +  public void testGetJsonObject() throws Exception {
    +    setColumnWidths(new int[]{260});
    +    String query = "select * from hive.simple_json where GET_JSON_OBJECT(simple_json.json, '$.DocId') = 'DocId2'";
    +    List results = testSqlWithResults(query);
    +    String expected = "json\n" + "{\"DocId\":\"DocId2\",\"User\":{\"Id\":122,\"Username\":\"larry122\",\"Name\":" +
    --- End diff --

    I've brought this test in line with the common design. Thanks.

        @Test // DRILL-4459
        public void testGetJsonObject() throws Exception {
          testBuilder()
              .sqlQuery("select convert_from(json, 'json') as json from hive.simple_json " +
                  "where GET_JSON_OBJECT(simple_json.json, '$.employee_id') = 'Emp2'")
              .ordered()
              .baselineColumns("json")
              .baselineValues(mapOf("employee_id","2","full_name","Kamesh","first_name","Bh","last_name","Venkata","position","Store"))
              .go();
        }
[jira] [Commented] (DRILL-4459) SchemaChangeException while querying hive json table
[ https://issues.apache.org/jira/browse/DRILL-4459?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15197507#comment-15197507 ]

ASF GitHub Bot commented on DRILL-4459:
---

Github user jaltekruse commented on a diff in the pull request:

    https://github.com/apache/drill/pull/431#discussion_r56354881

    --- Diff: contrib/storage-hive/core/src/test/java/org/apache/drill/exec/fn/hive/TestInbuiltHiveUDFs.java ---
    @@ -43,4 +47,17 @@ public void testEncode() throws Exception {
             .baselineValues(new Object[] { null })
             .go();
       }
    +
    +  @Test // DRILL-4459
    +  public void testGetJsonObject() throws Exception {
    +    setColumnWidths(new int[]{260});
    +    String query = "select * from hive.simple_json where GET_JSON_OBJECT(simple_json.json, '$.DocId') = 'DocId2'";
    +    List results = testSqlWithResults(query);
    +    String expected = "json\n" + "{\"DocId\":\"DocId2\",\"User\":{\"Id\":122,\"Username\":\"larry122\",\"Name\":" +
    --- End diff --

    Can you specify this baseline as a complex object instead of a string? The testBuilder can be used to check results against Java POJOs, and it includes the helper methods listOf/mapOf for building up complex structures.
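To illustrate the review comment above, here is a minimal sketch of why a structured baseline tends to be more robust than a string baseline. The `mapOf` method below is a hypothetical local stand-in written for this example; it is not the helper from Drill's test framework, and the sample values are taken from the string in the quoted diff.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class StructuredBaselineSketch {
    // Hypothetical stand-in for a mapOf-style helper: builds a map from alternating keys and values.
    static Map<String, Object> mapOf(Object... keyValues) {
        if (keyValues.length % 2 != 0) {
            throw new IllegalArgumentException("expected alternating keys and values");
        }
        Map<String, Object> map = new LinkedHashMap<>();
        for (int i = 0; i < keyValues.length; i += 2) {
            map.put((String) keyValues[i], keyValues[i + 1]);
        }
        return map;
    }

    public static void main(String[] args) {
        Map<String, Object> expected =
            mapOf("DocId", "DocId2", "User", mapOf("Id", 122, "Username", "larry122"));
        // Same content, different key order, as a reader might legitimately produce it.
        Map<String, Object> actual =
            mapOf("User", mapOf("Username", "larry122", "Id", 122), "DocId", "DocId2");

        // Structural comparison ignores key order; the rendered strings differ.
        System.out.println(expected.equals(actual));
        System.out.println(expected.toString().equals(actual.toString()));
    }
}
```

A string baseline also depends on escaping and column width settings (note the `setColumnWidths` call in the quoted diff), which a structural comparison avoids.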
[jira] [Commented] (DRILL-4459) SchemaChangeException while querying hive json table
[ https://issues.apache.org/jira/browse/DRILL-4459?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15197671#comment-15197671 ]

ASF GitHub Bot commented on DRILL-4459:
---

Github user vkorukanti commented on the pull request:

    https://github.com/apache/drill/pull/431#issuecomment-197423446

    There are many functions in Drill which accept VARCHAR only and not VAR16CHAR. Why not convert the Hive UDF return type to VARCHAR when the Hive return type is String? That way we don't have to deal with the many Drill UDFs that don't accept VAR16CHAR as input yet. For now this case works fine, but if the output of a Hive UDF is given as input to some other function (e.g. LIKE), it won't work, because we don't have an implementation of LIKE for VAR16CHAR input.
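The failure mode discussed in the comment above can be sketched as a lookup keyed by function name and input type. Everything below is hypothetical and self-contained: the `FunctionLookupSketch` class and its `MinorType` enum are illustrative stand-ins, not Drill's function registry, and the single registered entry (`castBIT` over VARCHAR) is an assumption chosen to mirror the error text in the issue.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

class FunctionLookupSketch {
    // Hypothetical subset of type names, mirroring those that appear in the thread.
    enum MinorType { VARCHAR, VAR16CHAR, BIT }

    // Registry keyed by "name(ARG_TYPE)"; the value is the function's return type.
    private final Map<String, MinorType> implementations = new HashMap<>();

    void register(String name, MinorType argType, MinorType returnType) {
        implementations.put(key(name, argType), returnType);
    }

    // Returns the return type if an implementation exists for exactly this input type.
    Optional<MinorType> resolve(String name, MinorType argType) {
        return Optional.ofNullable(implementations.get(key(name, argType)));
    }

    private static String key(String name, MinorType argType) {
        return name + "(" + argType + ")";
    }

    public static void main(String[] args) {
        FunctionLookupSketch registry = new FunctionLookupSketch();
        registry.register("castBIT", MinorType.VARCHAR, MinorType.BIT); // assumed entry

        // A VAR16CHAR input finds no implementation, analogous to
        // "Missing function implementation: [castBIT(VAR16CHAR-OPTIONAL)]".
        System.out.println(registry.resolve("castBIT", MinorType.VAR16CHAR).isPresent());
        // Once the Hive UDF result is typed as VARCHAR, the same lookup succeeds.
        System.out.println(registry.resolve("castBIT", MinorType.VARCHAR).isPresent());
    }
}
```

Seen this way, the suggestion is to change the type at the single point where Hive UDF results enter Drill, rather than registering a VAR16CHAR variant for every function.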
[jira] [Commented] (DRILL-4459) SchemaChangeException while querying hive json table
[ https://issues.apache.org/jira/browse/DRILL-4459?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15197107#comment-15197107 ]

ASF GitHub Bot commented on DRILL-4459:
---

GitHub user vdiravka opened a pull request:

    https://github.com/apache/drill/pull/431

    DRILL-4459: SchemaChangeException while querying hive json table

    - added Var16Char for comparison drill functions;
    - added UTest for hive GET_JSON_OBJECT UDF.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/vdiravka/drill DRILL-4459

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/drill/pull/431.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #431

commit 71802544886d024fde3c22e6a507a4c7ca537148
Author: Vitalii Diravka
Date:   2016-03-10T14:52:28Z

    DRILL-4459: SchemaChangeException while querying hive json table

    - added Var16Char for comparison drill functions;
    - added UTest for hive GET_JSON_OBJECT UDF.
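The first item in the pull request description (adding Var16Char to the comparison functions) suggests that a VARCHAR comparison cannot simply be reused on VAR16CHAR data. A minimal sketch of why, assuming VAR16CHAR values hold UTF-16 encoded bytes and VARCHAR values hold UTF-8 encoded bytes; the class and method names are hypothetical and this is not the code from the pull request:

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

class Var16CharCompareSketch {
    // Hypothetical helper: compares a UTF-16 encoded value with a UTF-8 encoded value
    // by decoding both to Java strings before comparing.
    static boolean equalsAcrossEncodings(byte[] utf16Value, byte[] utf8Value) {
        String left = new String(utf16Value, StandardCharsets.UTF_16);
        String right = new String(utf8Value, StandardCharsets.UTF_8);
        return left.equals(right);
    }

    public static void main(String[] args) {
        byte[] asUtf16 = "DocId2".getBytes(StandardCharsets.UTF_16);
        byte[] asUtf8 = "DocId2".getBytes(StandardCharsets.UTF_8);

        // The same text has different byte representations, so a byte-wise comparison fails.
        System.out.println(Arrays.equals(asUtf16, asUtf8));
        // Decoding each side with its own charset makes the values comparable.
        System.out.println(equalsAcrossEncodings(asUtf16, asUtf8));
    }
}
```

Later messages in the thread show that the merged fix took a different route and converted the Hive STRING return type to VARCHAR instead.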