Chun Chang created DRILL-1962:
---------------------------------

             Summary: accessing nested array from multiple files causing 
IndexOutOfBoundException
                 Key: DRILL-1962
                 URL: https://issues.apache.org/jira/browse/DRILL-1962
             Project: Apache Drill
          Issue Type: Bug
          Components: Execution - Operators
    Affects Versions: 0.8.0
            Reporter: Chun Chang
            Assignee: Chris Westin


#Wed Jan 07 18:54:07 EST 2015
git.commit.id.abbrev=35a350f

If the dataset contains an array nested inside another array, and the data is 
spread across more than one file, accessing the second nested array causes an 
IndexOutOfBoundsException. For example, with the following dataset:

{code}
{
    "id": 2,
    "oooa": {
        "oa": {
            "oab": {
                "oabc": [
                    {
                        "rowId": 2
                    },
                    {
                        "rowValue1": [{"rv1":1, "rv2":2}, {"rva1":3, "rva2":4}],
                        "rowValue2": [{"rw1":1, "rw2":2}, {"rwa1":3, "rwa2":4}]
                    }
                ]
            }
        }
    }
}
{code}

If you put it in two separate files in the same directory, a query that uses a 
wildcard to read both files and accesses the second array level causes the 
exception.

{code}
0: jdbc:drill:schema=dfs.drillTestDirComplexJ> select 
t.oooa.oa.oab.oabc[1].rowValue1 from `jira2file/jira*.json` t;
+------------+
|   EXPR$0   |
+------------+
| [{"rv1":1,"rv2":2},{"rva1":3,"rva2":4}] |
Query failed: Query failed: Failure while running fragment., index: -4, length: 
4 (expected: range(0, 16384)) [ 78235243-4f01-4ee3-9675-fc18bd1e66e3 on 
qa-node120.qa.lab:31010 ]
[ 78235243-4f01-4ee3-9675-fc18bd1e66e3 on qa-node120.qa.lab:31010 ]


java.lang.RuntimeException: java.sql.SQLException: Failure while executing 
query.
        at sqlline.SqlLine$IncrementalRows.hasNext(SqlLine.java:2514)
        at sqlline.SqlLine$TableOutputFormat.print(SqlLine.java:2148)
        at sqlline.SqlLine.print(SqlLine.java:1809)
        at sqlline.SqlLine$Commands.execute(SqlLine.java:3766)
        at sqlline.SqlLine$Commands.sql(SqlLine.java:3663)
        at sqlline.SqlLine.dispatch(SqlLine.java:889)
        at sqlline.SqlLine.begin(SqlLine.java:763)
        at sqlline.SqlLine.start(SqlLine.java:498)
        at sqlline.SqlLine.main(SqlLine.java:460)
0: jdbc:drill:schema=dfs.drillTestDirComplexJ>
{code}
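
The two-file layout used in the failing query can be set up with a small script. This is only a sketch: it assumes both files hold the same record (the report says "put it in two separate files" without showing the second file), it writes to a local temporary directory rather than the maprfs path in the report, and `write_repro_files` is a hypothetical helper name. The file names `jira1.json` and `jira2.json` and the directory `jira2file` are taken from the query and the plan.

```python
import json
import tempfile
from pathlib import Path

# The record from the report, reproduced as a Python literal.
RECORD = {
    "id": 2,
    "oooa": {"oa": {"oab": {"oabc": [
        {"rowId": 2},
        {
            "rowValue1": [{"rv1": 1, "rv2": 2}, {"rva1": 3, "rva2": 4}],
            "rowValue2": [{"rw1": 1, "rw2": 2}, {"rwa1": 3, "rwa2": 4}],
        },
    ]}}},
}


def write_repro_files(root, names=("jira1.json", "jira2.json")):
    """Write the same record into each named file under <root>/jira2file.

    Assumption: both files contain an identical record.
    """
    target = Path(root) / "jira2file"
    target.mkdir(parents=True, exist_ok=True)
    paths = []
    for name in names:
        path = target / name
        path.write_text(json.dumps(RECORD, indent=4) + "\n")
        paths.append(path)
    return paths


if __name__ == "__main__":
    # Local temp directory stands in for the workspace root used in the report.
    for p in write_repro_files(tempfile.mkdtemp()):
        print(p)
```

Pointing a Drill workspace at the generated directory and running the wildcard query against `jira2file/jira*.json` should then exercise the same two-file scan.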

The same query on a single file works.

{code}
0: jdbc:drill:schema=dfs.drillTestDirComplexJ> select 
t.oooa.oa.oab.oabc[1].rowValue1 from `jira2file/jira1.json` t;
+------------+
|   EXPR$0   |
+------------+
| [{"rv1":1,"rv2":2},{"rva1":3,"rva2":4}] |
+------------+
{code}

Stack trace:

{code}
2015-01-08 14:00:16,127 [2b51020e-daab-a903-ef7c-ef6f9bd606c7:frag:0:0] WARN  
o.a.d.e.w.fragment.FragmentExecutor - Error while initializing or executing 
fragment
java.lang.IndexOutOfBoundsException: index: -4, length: 4 (expected: range(0, 
16384))
        at io.netty.buffer.DrillBuf.checkIndexD(DrillBuf.java:156) 
~[drill-java-exec-0.8.0-SNAPSHOT-rebuffed.jar:4.0.24.Final]
        at io.netty.buffer.DrillBuf.chk(DrillBuf.java:178) 
~[drill-java-exec-0.8.0-SNAPSHOT-rebuffed.jar:4.0.24.Final]
        at io.netty.buffer.DrillBuf.getInt(DrillBuf.java:447) 
~[drill-java-exec-0.8.0-SNAPSHOT-rebuffed.jar:4.0.24.Final]
        at 
org.apache.drill.exec.vector.UInt4Vector$Accessor.get(UInt4Vector.java:297) 
~[drill-java-exec-0.8.0-SNAPSHOT-rebuffed.jar:0.8.0-SNAPSHOT]
        at 
org.apache.drill.exec.vector.complex.RepeatedMapVector$RepeatedMapAccessor.get(RepeatedMapVector.java:542)
 ~[drill-java-exec-0.8.0-SNAPSHOT-rebuffed.jar:0.8.0-SNAPSHOT]
        at 
org.apache.drill.exec.vector.complex.impl.RepeatedMapReaderImpl.setPosition(RepeatedMapReaderImpl.java:90)
 ~[drill-java-exec-0.8.0-SNAPSHOT-rebuffed.jar:0.8.0-SNAPSHOT]
        at 
org.apache.drill.exec.vector.complex.impl.RepeatedMapReaderImpl.setChildrenPosition(RepeatedMapReaderImpl.java:45)
 ~[drill-java-exec-0.8.0-SNAPSHOT-rebuffed.jar:0.8.0-SNAPSHOT]
        at 
org.apache.drill.exec.vector.complex.impl.RepeatedMapReaderImpl.setPosition(RepeatedMapReaderImpl.java:96)
 ~[drill-java-exec-0.8.0-SNAPSHOT-rebuffed.jar:0.8.0-SNAPSHOT]
        at 
org.apache.drill.exec.test.generated.ProjectorGen4732.doEval(ProjectorTemplate.java:30)
 ~[na:na]
        at 
org.apache.drill.exec.test.generated.ProjectorGen4732.projectRecords(ProjectorTemplate.java:64)
 ~[na:na]
        at 
org.apache.drill.exec.physical.impl.project.ProjectRecordBatch.doWork(ProjectRecordBatch.java:172)
 ~[drill-java-exec-0.8.0-SNAPSHOT-rebuffed.jar:0.8.0-SNAPSHOT]
        at 
org.apache.drill.exec.record.AbstractSingleRecordBatch.innerNext(AbstractSingleRecordBatch.java:93)
 ~[drill-java-exec-0.8.0-SNAPSHOT-rebuffed.jar:0.8.0-SNAPSHOT]
        at 
org.apache.drill.exec.physical.impl.project.ProjectRecordBatch.innerNext(ProjectRecordBatch.java:132)
 ~[drill-java-exec-0.8.0-SNAPSHOT-rebuffed.jar:0.8.0-SNAPSHOT]
        at 
org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:142)
 ~[drill-java-exec-0.8.0-SNAPSHOT-rebuffed.jar:0.8.0-SNAPSHOT]
        at 
org.apache.drill.exec.physical.impl.validate.IteratorValidatorBatchIterator.next(IteratorValidatorBatchIterator.java:118)
 ~[drill-java-exec-0.8.0-SNAPSHOT-rebuffed.jar:0.8.0-SNAPSHOT]
        at 
org.apache.drill.exec.physical.impl.BaseRootExec.next(BaseRootExec.java:67) 
~[drill-java-exec-0.8.0-SNAPSHOT-rebuffed.jar:0.8.0-SNAPSHOT]
        at 
org.apache.drill.exec.physical.impl.ScreenCreator$ScreenRoot.innerNext(ScreenCreator.java:97)
 ~[drill-java-exec-0.8.0-SNAPSHOT-rebuffed.jar:0.8.0-SNAPSHOT]
        at 
org.apache.drill.exec.physical.impl.BaseRootExec.next(BaseRootExec.java:57) 
~[drill-java-exec-0.8.0-SNAPSHOT-rebuffed.jar:0.8.0-SNAPSHOT]
        at 
org.apache.drill.exec.work.fragment.FragmentExecutor.run(FragmentExecutor.java:114)
 ~[drill-java-exec-0.8.0-SNAPSHOT-rebuffed.jar:0.8.0-SNAPSHOT]
        at 
org.apache.drill.exec.work.WorkManager$RunnableWrapper.run(WorkManager.java:254)
 [drill-java-exec-0.8.0-SNAPSHOT-rebuffed.jar:0.8.0-SNAPSHOT]
        at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) 
[na:1.7.0_45]
        at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) 
[na:1.7.0_45]
        at java.lang.Thread.run(Thread.java:744) [na:1.7.0_45]
{code}

Physical plan:

{code}
0: jdbc:drill:schema=dfs.drillTestDirComplexJ> explain plan for select 
t.oooa.oa.oab.oabc[1].rowValue1 from `jira2file/jira*.json` t;
+------------+------------+
|    text    |    json    |
+------------+------------+
| 00-00    Screen
00-01      Project(EXPR$0=[ITEM(ITEM(ITEM(ITEM(ITEM($0, 'oa'), 'oab'), 'oabc'), 
1), 'rowValue1')])
00-02        Scan(groupscan=[EasyGroupScan 
[selectionRoot=/drill/testdata/complex_type/json/jira2file, numFiles=2, 
columns=[`oooa`.`oa`.`oab`.`oabc`[1].`rowValue1`], 
files=[maprfs:/drill/testdata/complex_type/json/jira2file/jira1.json, 
maprfs:/drill/testdata/complex_type/json/jira2file/jira2.json]]])
{code}
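
The nested `ITEM` calls in the Project step amount to a plain path lookup on each record. The sketch below mimics that lookup in Python on the sample record to show the value the projection is expected to produce per row. It is an illustration only, not Drill's implementation: `item` and `project` are hypothetical names, and `$0` is assumed to be the `oooa` column.

```python
import json

# The sample record from the report.
RECORD = {
    "id": 2,
    "oooa": {"oa": {"oab": {"oabc": [
        {"rowId": 2},
        {
            "rowValue1": [{"rv1": 1, "rv2": 2}, {"rva1": 3, "rva2": 4}],
            "rowValue2": [{"rw1": 1, "rw2": 2}, {"rwa1": 3, "rwa2": 4}],
        },
    ]}}},
}


def item(value, key):
    """Mimic one ITEM step: map lookup by name, or array lookup by index."""
    return value[key]


def project(record):
    """Evaluate ITEM(ITEM(ITEM(ITEM(ITEM($0, 'oa'), 'oab'), 'oabc'), 1), 'rowValue1')."""
    col0 = record["oooa"]  # assumed to be $0 in the plan
    return item(item(item(item(item(col0, "oa"), "oab"), "oabc"), 1), "rowValue1")


if __name__ == "__main__":
    # Compact separators match the formatting of the sqlline output.
    print(json.dumps(project(RECORD), separators=(",", ":")))
    # prints [{"rv1":1,"rv2":2},{"rva1":3,"rva2":4}]
```

This matches the single row returned before the failure. If both files hold the same record, the wildcard query would be expected to return this value once per file.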




