Khurram Faraaz created DRILL-5456:
-------------------------------------

             Summary: StringIndexOutOfBoundsException when converting a JSON array to UTF-8
                 Key: DRILL-5456
                 URL: https://issues.apache.org/jira/browse/DRILL-5456
             Project: Apache Drill
          Issue Type: Bug
          Components: Execution - Flow
    Affects Versions: 1.11.0
            Reporter: Khurram Faraaz

Converting a JSON array to UTF-8 with the CONVERT_TO function results in a StringIndexOutOfBoundsException.

Apache Drill 1.11.0 commit ID: 3e8b01d

Data used in the test

{noformat}
[root@centos-01 ~]# cat rptd_count.json
{"arr":[0,1,2,3,4,5,6,7,8,9,10],"id":1}
{"arr":[0,1,2,3,4,5,6,7,8,9,10],"id":2}
{"arr":[0,1,2,3,4,5,6,7,8,9,10],"id":3}
{"arr":[0,1,2,3,4,5,6,7,8,9,10],"id":4}
{"arr":[0,1,2,3,4,5,6,7,8,9,10],"id":5}
{"arr":[0,1,2,3,4,5,6,7,8,9,10],"id":6}
{"arr":[0,1,2,3,4,5,6,7,8,9,10],"id":7}
{"arr":[0,1,2,3,4,5,6,7,8,9,10],"id":8}
{"arr":[0,1,2,3,4,5,6,7,8,9,10],"id":9}
{"arr":[0,1,2,3,4,5,6,7,8,9,10],"id":10}
[root@centos-01 ~]#
{noformat}

Failing query

{noformat}
0: jdbc:drill:schema=dfs.tmp> select convert_to(t.arr,'UTF-8') c, id from `rptd_count.json` t;
Error: SYSTEM ERROR: StringIndexOutOfBoundsException: String index out of range: -3

[Error Id: 056a13c0-6c9f-403e-877e-040e907d6581 on centos-01.qa.lab:31010] (state=,code=0)
0: jdbc:drill:schema=dfs.tmp>
{noformat}

Data from the JSON file

{noformat}
0: jdbc:drill:schema=dfs.tmp> select * from `rptd_count.json`;
+---------------------------+-----+
|            arr            | id  |
+---------------------------+-----+
| [0,1,2,3,4,5,6,7,8,9,10]  | 1   |
| [0,1,2,3,4,5,6,7,8,9,10]  | 2   |
| [0,1,2,3,4,5,6,7,8,9,10]  | 3   |
| [0,1,2,3,4,5,6,7,8,9,10]  | 4   |
| [0,1,2,3,4,5,6,7,8,9,10]  | 5   |
| [0,1,2,3,4,5,6,7,8,9,10]  | 6   |
| [0,1,2,3,4,5,6,7,8,9,10]  | 7   |
| [0,1,2,3,4,5,6,7,8,9,10]  | 8   |
| [0,1,2,3,4,5,6,7,8,9,10]  | 9   |
| [0,1,2,3,4,5,6,7,8,9,10]  | 10  |
+---------------------------+-----+
10 rows selected (0.224 seconds)
{noformat}

Stack trace from drillbit.log

{noformat}
2017-05-01 19:32:34,209 [26f872ad-62a3-d7a7-aec1-9c7d937a2416:foreman] INFO  o.a.drill.exec.work.foreman.Foreman - Query text for query id 26f872ad-62a3-d7a7-aec1-9c7d937a2416: select convert_to(t.arr,'UTF-8') c, id from `rptd_count.json` t
...
2017-05-01 19:32:34,328 [26f872ad-62a3-d7a7-aec1-9c7d937a2416:foreman] ERROR o.a.drill.exec.work.foreman.Foreman - SYSTEM ERROR: StringIndexOutOfBoundsException: String index out of range: -3

[Error Id: 056a13c0-6c9f-403e-877e-040e907d6581 on centos-01.qa.lab:31010]
org.apache.drill.common.exceptions.UserException: SYSTEM ERROR: StringIndexOutOfBoundsException: String index out of range: -3

[Error Id: 056a13c0-6c9f-403e-877e-040e907d6581 on centos-01.qa.lab:31010]
        at org.apache.drill.common.exceptions.UserException$Builder.build(UserException.java:544) ~[drill-common-1.11.0-SNAPSHOT.jar:1.11.0-SNAPSHOT]
        at org.apache.drill.exec.work.foreman.Foreman$ForemanResult.close(Foreman.java:847) [drill-java-exec-1.11.0-SNAPSHOT.jar:1.11.0-SNAPSHOT]
        at org.apache.drill.exec.work.foreman.Foreman.moveToState(Foreman.java:977) [drill-java-exec-1.11.0-SNAPSHOT.jar:1.11.0-SNAPSHOT]
        at org.apache.drill.exec.work.foreman.Foreman.run(Foreman.java:297) [drill-java-exec-1.11.0-SNAPSHOT.jar:1.11.0-SNAPSHOT]
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) [na:1.8.0_91]
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [na:1.8.0_91]
        at java.lang.Thread.run(Thread.java:745) [na:1.8.0_91]
Caused by: org.apache.drill.exec.work.foreman.ForemanException: Unexpected exception during fragment initialization: String index out of range: -3
        ... 4 common frames omitted
Caused by: java.lang.StringIndexOutOfBoundsException: String index out of range: -3
        at java.lang.String.substring(String.java:1931) ~[na:1.8.0_91]
        at org.apache.drill.exec.planner.logical.PreProcessLogicalRel.getConvertFunctionException(PreProcessLogicalRel.java:244) ~[drill-java-exec-1.11.0-SNAPSHOT.jar:1.11.0-SNAPSHOT]
        at org.apache.drill.exec.planner.logical.PreProcessLogicalRel.visit(PreProcessLogicalRel.java:148) ~[drill-java-exec-1.11.0-SNAPSHOT.jar:1.11.0-SNAPSHOT]
        at org.apache.calcite.rel.logical.LogicalProject.accept(LogicalProject.java:129) ~[calcite-core-1.4.0-drill-r21.jar:1.4.0-drill-r21]
        at org.apache.drill.exec.planner.sql.handlers.DefaultSqlHandler.preprocessNode(DefaultSqlHandler.java:641) ~[drill-java-exec-1.11.0-SNAPSHOT.jar:1.11.0-SNAPSHOT]
        at org.apache.drill.exec.planner.sql.handlers.DefaultSqlHandler.validateAndConvert(DefaultSqlHandler.java:196) ~[drill-java-exec-1.11.0-SNAPSHOT.jar:1.11.0-SNAPSHOT]
        at org.apache.drill.exec.planner.sql.handlers.DefaultSqlHandler.getPlan(DefaultSqlHandler.java:164) ~[drill-java-exec-1.11.0-SNAPSHOT.jar:1.11.0-SNAPSHOT]
        at org.apache.drill.exec.planner.sql.DrillSqlWorker.getQueryPlan(DrillSqlWorker.java:131) ~[drill-java-exec-1.11.0-SNAPSHOT.jar:1.11.0-SNAPSHOT]
        at org.apache.drill.exec.planner.sql.DrillSqlWorker.getPlan(DrillSqlWorker.java:79) ~[drill-java-exec-1.11.0-SNAPSHOT.jar:1.11.0-SNAPSHOT]
        at org.apache.drill.exec.work.foreman.Foreman.runSQL(Foreman.java:1050) [drill-java-exec-1.11.0-SNAPSHOT.jar:1.11.0-SNAPSHOT]
        at org.apache.drill.exec.work.foreman.Foreman.run(Foreman.java:280) [drill-java-exec-1.11.0-SNAPSHOT.jar:1.11.0-SNAPSHOT]
        ... 3 common frames omitted
{noformat}

--
This message was sent by Atlassian JIRA (v6.3.15#6346)
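
Note on the innermost frame: the trace shows String.substring throwing while PreProcessLogicalRel.getConvertFunctionException was building an error message, so the user never sees the intended message. The sketch below is not Drill's code. It only illustrates, under assumed values, how an unguarded substring(beginIndex) fails when beginIndex exceeds the string length, and one way a guard could avoid it. On Java 8 the message reports length minus beginIndex (a 7-character string and index 10 would give -3); newer JDKs word the message differently. The class SubstringGuard, the helper suffixAfter, and the strings "CONVERT_TO" and "CONVERT" are hypothetical and chosen only to reproduce the arithmetic.

```java
public class SubstringGuard {

    // Hypothetical helper (not part of Drill): returns the part of `name`
    // that follows `prefix`, compared case-insensitively, or "" when `name`
    // is shorter than `prefix` or does not start with it.
    static String suffixAfter(String name, String prefix) {
        if (!name.regionMatches(true, 0, prefix, 0, prefix.length())) {
            return "";
        }
        return name.substring(prefix.length());
    }

    public static void main(String[] args) {
        String prefix = "CONVERT_TO"; // 10 characters (assumed value)
        String shortName = "CONVERT"; // 7 characters (assumed value)

        // Unguarded call: beginIndex (10) is larger than the length (7).
        try {
            shortName.substring(prefix.length());
        } catch (StringIndexOutOfBoundsException e) {
            System.out.println("unguarded call failed: " + e.getMessage());
        }

        // Guarded calls never throw.
        System.out.println("guarded, short name: '" + suffixAfter(shortName, prefix) + "'");
        System.out.println("guarded, full name:  '" + suffixAfter("CONVERT_TOJSON", prefix) + "'");
    }
}
```

Whether a guard like this or a different error path is the right fix depends on what getConvertFunctionException actually passes to substring at PreProcessLogicalRel.java:244, which the trace alone does not show.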