[ https://issues.apache.org/jira/browse/PARQUET-246?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Ryan Blue reopened PARQUET-246:
-------------------------------

> ArrayIndexOutOfBoundsException with Parquet write version v2
> ------------------------------------------------------------
>
>                 Key: PARQUET-246
>                 URL: https://issues.apache.org/jira/browse/PARQUET-246
>             Project: Parquet
>          Issue Type: Bug
>    Affects Versions: 1.6.0
>            Reporter: Konstantin Shaposhnikov
>             Fix For: 2.0.0
>
>
> I am getting the following exception when reading a parquet file that was
> created using Avro WriteSupport and Parquet write version v2.0:
> {noformat}
> Caused by: parquet.io.ParquetDecodingException: Can't read value in column
> [colName, rows, array, name] BINARY at value 313601 out of 428260, 1 out of
> 39200 in currentPage. repetition level: 0, definition level: 2
>       at parquet.column.impl.ColumnReaderImpl.readValue(ColumnReaderImpl.java:462)
>       at parquet.column.impl.ColumnReaderImpl.writeCurrentValueToConverter(ColumnReaderImpl.java:364)
>       at parquet.io.RecordReaderImplementation.read(RecordReaderImplementation.java:405)
>       at parquet.hadoop.InternalParquetRecordReader.nextKeyValue(InternalParquetRecordReader.java:209)
>       ... 27 more
> Caused by: java.lang.ArrayIndexOutOfBoundsException
>       at parquet.column.values.deltastrings.DeltaByteArrayReader.readBytes(DeltaByteArrayReader.java:70)
>       at parquet.column.impl.ColumnReaderImpl$2$6.read(ColumnReaderImpl.java:307)
>       at parquet.column.impl.ColumnReaderImpl.readValue(ColumnReaderImpl.java:458)
>       ... 30 more
> {noformat}
> The file is quite big (500Mb) so I cannot upload it here, but possibly there
> is enough information in the exception message to understand the cause of
> error.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)