[ https://issues.apache.org/jira/browse/DRILL-1963?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Rahul Challapalli updated DRILL-1963:
-------------------------------------
    Attachment: morecolumns2.parquet

Could not attach the json file here as it is 19.23MB. Below is a vague description of the shape of the json/parquet file:
{code}
{
    "depth":1,
    "noOfScalarTypes":5000,
    "repeatedScalarTypesInfo": [100,100,100,100,100,100,100,100,100,100,100,100],
    "noOfComplexTypes":1,
    "repeatedComplexTypesInfo": []
}
{
    "depth":2,
    "noOfScalarTypes":1000,
    "repeatedScalarTypesInfo": [10, 5, 7, 4, 7, 25, 5, 10, 73, 5, 15, 20],
    "noOfComplexTypes":0,
    "repeatedComplexTypesInfo": [1]
}
{
    "depth":3,
    "noOfScalarTypes":100,
    "repeatedScalarTypesInfo": [10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10],
    "noOfComplexTypes":0,
    "repeatedComplexTypesInfo": []
}
{code}

> select * on a parquet file created from json with large no of columns fail
> --------------------------------------------------------------------------
>
>                 Key: DRILL-1963
>                 URL: https://issues.apache.org/jira/browse/DRILL-1963
>             Project: Apache Drill
>          Issue Type: Bug
>          Components: Storage - Parquet, Storage - Writer
>            Reporter: Rahul Challapalli
>            Assignee: Parth Chandra
>         Attachments: morecolumns2.parquet
>
>
> git.commit.id.abbrev=b491cdb
> The datafile (parquet) is created by drill from a json file. The below failing
> query works fine on the json file.
> Attached both the json and drill generated parquet files
> {code}
> 0: jdbc:drill:schema=dfs.wide-columns> select * from morecolumns2;
> -- drill prints all the columns....just ignoring them because of space
> java.lang.IndexOutOfBoundsException: index: 0, length: 8 (expected: range(0, 0))
>     at io.netty.buffer.DrillBuf.checkIndexD(DrillBuf.java:156)
>     at io.netty.buffer.DrillBuf.chk(DrillBuf.java:178)
>     at io.netty.buffer.DrillBuf.getLong(DrillBuf.java:420)
>     at org.apache.drill.exec.vector.BigIntVector$Accessor.get(BigIntVector.java:297)
>     at org.apache.drill.exec.vector.BigIntVector$Accessor.getObject(BigIntVector.java:302)
>     at org.apache.drill.exec.vector.RepeatedBigIntVector$Accessor.getObject(RepeatedBigIntVector.java:332)
>     at org.apache.drill.exec.vector.RepeatedBigIntVector$Accessor.getObject(RepeatedBigIntVector.java:308)
>     at org.apache.drill.exec.vector.accessor.GenericAccessor.getObject(GenericAccessor.java:38)
>     at org.apache.drill.jdbc.AvaticaDrillSqlAccessor.getObject(AvaticaDrillSqlAccessor.java:136)
>     at net.hydromatic.avatica.AvaticaResultSet.getObject(AvaticaResultSet.java:351)
>     at sqlline.SqlLine$Rows$Row.<init>(SqlLine.java:2388)
>     at sqlline.SqlLine$IncrementalRows.hasNext(SqlLine.java:2504)
>     at sqlline.SqlLine$TableOutputFormat.print(SqlLine.java:2148)
>     at sqlline.SqlLine.print(SqlLine.java:1809)
>     at sqlline.SqlLine$Commands.execute(SqlLine.java:3766)
>     at sqlline.SqlLine$Commands.sql(SqlLine.java:3663)
>     at sqlline.SqlLine.dispatch(SqlLine.java:889)
>     at sqlline.SqlLine.begin(SqlLine.java:763)
>     at sqlline.SqlLine.start(SqlLine.java:498)
>     at sqlline.SqlLine.main(SqlLine.java:460)
> {code}
> Error from the logs :
> {code}
> 2015-01-08 22:06:37,780 [2b5100a6-c361-8633-5ede-5ae4968b35d0:frag:0:0] WARN o.a.d.e.p.impl.SendingAccountor - Failure while waiting for send complete.
> java.lang.InterruptedException: null
>     at java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireSharedInterruptibly(AbstractQueuedSynchronizer.java:996) ~[na:1.7.0_71]
>     at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireSharedInterruptibly(AbstractQueuedSynchronizer.java:1303) ~[na:1.7.0_71]
>     at java.util.concurrent.Semaphore.acquire(Semaphore.java:472) ~[na:1.7.0_71]
>     at org.apache.drill.exec.physical.impl.SendingAccountor.waitForSendComplete(SendingAccountor.java:44) ~[drill-java-exec-0.8.0-SNAPSHOT-rebuffed.jar:0.8.0-SNAPSHOT]
>     at org.apache.drill.exec.physical.impl.ScreenCreator$ScreenRoot.stop(ScreenCreator.java:186) [drill-java-exec-0.8.0-SNAPSHOT-rebuffed.jar:0.8.0-SNAPSHOT]
>     at org.apache.drill.exec.work.fragment.FragmentExecutor.closeOutResources(FragmentExecutor.java:144) [drill-java-exec-0.8.0-SNAPSHOT-rebuffed.jar:0.8.0-SNAPSHOT]
>     at org.apache.drill.exec.work.fragment.FragmentExecutor.run(FragmentExecutor.java:119) [drill-java-exec-0.8.0-SNAPSHOT-rebuffed.jar:0.8.0-SNAPSHOT]
>     at org.apache.drill.exec.work.WorkManager$RunnableWrapper.run(WorkManager.java:254) [drill-java-exec-0.8.0-SNAPSHOT-rebuffed.jar:0.8.0-SNAPSHOT]
>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) [na:1.7.0_71]
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) [na:1.7.0_71]
>     at java.lang.Thread.run(Thread.java:745) [na:1.7.0_71]
> 2015-01-08 22:06:37,780 [2b5100a6-c361-8633-5ede-5ae4968b35d0:frag:0:0] INFO o.a.drill.exec.work.foreman.Foreman - Dropping request to move to COMPLETED state as query is already at CANCELED state (which is terminal).
> {code}
> The below query also results in a similar exception
> {code}
> select Obj0_level1 from morecolumns2;
> {code}
> However the below queries succeed :
> {code}
> select t.Obj0_level1.STRING960_level2 from morecolumns2 t;
> select t.Obj0_level1.Array_Object0_level2[0] from morecolumns2 t;
> select t.Obj0_level1.Array_Object0_level2[0].TINYINT22_level3 from morecolumns2 t;
> {code}

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)