[
https://issues.apache.org/jira/browse/HIVE-17448?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Aniket Mokashi reassigned HIVE-17448:
-------------------------------------
Assignee: Aniket Mokashi
> ArrayIndexOutOfBoundsException on ORC tables after adding a struct field
> ------------------------------------------------------------------------
>
> Key: HIVE-17448
> URL: https://issues.apache.org/jira/browse/HIVE-17448
> Project: Hive
> Issue Type: Bug
> Components: ORC
> Affects Versions: 2.1.1
> Environment: Reproduced on Dataproc 1.1, 1.2 (Hive 2.1).
> Reporter: Nikolay Sokolov
> Assignee: Aniket Mokashi
> Priority: Minor
> Attachments: HIVE-17448.1-branch-2.1.patch
>
>
> When ORC files were written with an older schema that had fewer struct fields,
> the table schema has since been changed to one with more struct fields, and a
> sibling field follows the struct in the row type, reading the table throws an
> ArrayIndexOutOfBoundsException. Steps to reproduce:
> {code:none}
> create external table test_broken_struct(a struct<f1:int, f2:int>, b int)
> stored as orc;
> insert into table test_broken_struct
> select named_struct("f1", 1, "f2", 2), 3;
> drop table test_broken_struct;
> create external table test_broken_struct(a struct<f1:int, f2:int, f3:int>, b int)
> stored as orc;
> select * from test_broken_struct;
> {code}
> The same scenario does not cause a crash on Hive 0.14.
> Debug log and stack trace:
> {code:none}
> 2017-09-07T00:21:40,266 INFO [main] orc.OrcInputFormat: Using schema evolution configuration variables schema.evolution.columns [a, b] / schema.evolution.columns.types [struct<f1:int,f2:int,f3:int>, int] (isAcidRead false)
> 2017-09-07T00:21:40,267 DEBUG [main] orc.OrcInputFormat: No ORC pushdown predicate
> 2017-09-07T00:21:40,267 INFO [main] orc.ReaderImpl: Reading ORC rows from hdfs://cluster-7199-m/user/hive/warehouse/test_broken_struct/000000_0 with {include: [true, true, true, true, true], offset: 3, length: 159, schema: struct<a:struct<f1:int,f2:int,f3:int>,b:int>}
> Failed with exception java.io.IOException:java.lang.ArrayIndexOutOfBoundsException: 5
> 2017-09-07T00:21:40,273 ERROR [main] CliDriver: Failed with exception java.io.IOException:java.lang.ArrayIndexOutOfBoundsException: 5
> java.io.IOException: java.lang.ArrayIndexOutOfBoundsException: 5
>     at org.apache.hadoop.hive.ql.exec.FetchOperator.getNextRow(FetchOperator.java:521)
>     at org.apache.hadoop.hive.ql.exec.FetchOperator.pushRow(FetchOperator.java:428)
>     at org.apache.hadoop.hive.ql.exec.FetchTask.fetch(FetchTask.java:146)
>     at org.apache.hadoop.hive.ql.Driver.getResults(Driver.java:2098)
>     at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:252)
>     at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:183)
>     at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:399)
>     at org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:776)
>     at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:714)
>     at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:641)
>     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>     at java.lang.reflect.Method.invoke(Method.java:498)
>     at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
>     at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
> Caused by: java.lang.ArrayIndexOutOfBoundsException: 5
>     at org.apache.orc.impl.SchemaEvolution.buildConversionFileTypesArray(SchemaEvolution.java:195)
>     at org.apache.orc.impl.SchemaEvolution.buildConversionFileTypesArray(SchemaEvolution.java:253)
>     at org.apache.orc.impl.SchemaEvolution.<init>(SchemaEvolution.java:59)
>     at org.apache.orc.impl.RecordReaderImpl.<init>(RecordReaderImpl.java:149)
>     at org.apache.hadoop.hive.ql.io.orc.RecordReaderImpl.<init>(RecordReaderImpl.java:63)
>     at org.apache.hadoop.hive.ql.io.orc.ReaderImpl.rowsOptions(ReaderImpl.java:87)
>     at org.apache.hadoop.hive.ql.io.orc.OrcInputFormat.createReaderFromFile(OrcInputFormat.java:314)
>     at org.apache.hadoop.hive.ql.io.orc.OrcInputFormat$OrcRecordReader.<init>(OrcInputFormat.java:225)
>     at org.apache.hadoop.hive.ql.io.orc.OrcInputFormat.getRecordReader(OrcInputFormat.java:1691)
>     at org.apache.hadoop.hive.ql.exec.FetchOperator$FetchInputFormatSplit.getRecordReader(FetchOperator.java:695)
>     at org.apache.hadoop.hive.ql.exec.FetchOperator.getRecordReader(FetchOperator.java:333)
>     at org.apache.hadoop.hive.ql.exec.FetchOperator.getNextRow(FetchOperator.java:459)
> ... 15 more
> {code}
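For context (an illustration, not part of the original report): ORC numbers the types in a schema by a pre-order walk of the type tree, so adding f3 to the struct shifts the sibling column b from type id 4 to type id 5, and indexing the file's 5-element type array with the reader's id 5 matches the `ArrayIndexOutOfBoundsException: 5` above. A minimal Python sketch of that numbering (the `flatten` helper and the type labels are hypothetical, for illustration only):

```python
def flatten(schema):
    """Flatten a nested type into the pre-order type list ORC uses;
    each type (including the root struct) gets a sequential integer id."""
    name, children = schema
    out = [name]
    for child in children:
        out.extend(flatten(child))
    return out

# Schema the file was written with: a struct<f1:int,f2:int>, b int
file_schema = ("struct<a,b>", [
    ("struct<f1,f2>", [("int f1", []), ("int f2", [])]),
    ("int b", []),
])
# Schema the table now declares: a struct<f1:int,f2:int,f3:int>, b int
reader_schema = ("struct<a,b>", [
    ("struct<f1,f2,f3>", [("int f1", []), ("int f2", []), ("int f3", [])]),
    ("int b", []),
])

file_types = flatten(file_schema)      # 5 types, ids 0..4; b has id 4
reader_types = flatten(reader_schema)  # 6 types, ids 0..5; b has id 5

# Looking up the file type for column b by its reader-side id overflows
# the 5-element file-type array: index 5, length 5.
print(len(file_types), reader_types.index("int b"))  # 5 5
```

This is consistent with the crash site, SchemaEvolution.buildConversionFileTypesArray, which walks the reader schema while indexing into the file's type list.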
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)