[ 
https://issues.apache.org/jira/browse/DRILL-6685?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

salim achouche updated DRILL-6685:
----------------------------------
    Labels: pull-request-available  (was: )

> Error in parquet record reader
> ------------------------------
>
>                 Key: DRILL-6685
>                 URL: https://issues.apache.org/jira/browse/DRILL-6685
>             Project: Apache Drill
>          Issue Type: Bug
>          Components: Storage - Parquet
>    Affects Versions: 1.14.0
>            Reporter: Robert Hou
>            Assignee: salim achouche
>            Priority: Major
>              Labels: pull-request-available
>             Fix For: 1.15.0
>
>         Attachments: drillbit.log.6685
>
>
> This is the query:
> select VarbinaryValue1 from dfs.`/drill/testdata/batch_memory/fourvarchar_asc_nulls_16MB.parquet` limit 36;
> It appears to be caused by this commit:
> DRILL-6570: Fixed IndexOutofBoundException in Parquet Reader
> aee899c1b26ebb9a5781d280d5a73b42c273d4d5
> This is the stack trace:
> {noformat}
> Error: INTERNAL_ERROR ERROR: Error in parquet record reader.
> Message: 
> Hadoop path: 
> /drill/testdata/batch_memory/fourvarchar_asc_nulls_16MB.parquet/0_0_0.parquet
> Total records read: 0
> Row group index: 0
> Records in row group: 1250
> Parquet Metadata: ParquetMetaData{FileMetaData{schema: message root {
>   optional int64 Index;
>   optional binary VarbinaryValue1;
>   optional int64 BigIntValue;
>   optional boolean BooleanValue;
>   optional int32 DateValue (DATE);
>   optional float FloatValue;
>   optional binary VarcharValue1 (UTF8);
>   optional double DoubleValue;
>   optional int32 IntegerValue;
>   optional int32 TimeValue (TIME_MILLIS);
>   optional int64 TimestampValue (TIMESTAMP_MILLIS);
>   optional binary VarbinaryValue2;
>   optional fixed_len_byte_array(12) IntervalYearValue (INTERVAL);
>   optional fixed_len_byte_array(12) IntervalDayValue (INTERVAL);
>   optional fixed_len_byte_array(12) IntervalSecondValue (INTERVAL);
>   optional binary VarcharValue2 (UTF8);
> }
> , metadata: {drill-writer.version=2, drill.version=1.14.0-SNAPSHOT}}, blocks: 
> [BlockMetaData{1250, 23750308 [ColumnMetaData{UNCOMPRESSED [Index] optional 
> int64 Index  [PLAIN, RLE, BIT_PACKED], 4}, ColumnMetaData{UNCOMPRESSED 
> [VarbinaryValue1] optional binary VarbinaryValue1  [PLAIN, RLE, BIT_PACKED], 
> 10057}, ColumnMetaData{UNCOMPRESSED [BigIntValue] optional int64 BigIntValue  
> [PLAIN, RLE, BIT_PACKED], 8174655}, ColumnMetaData{UNCOMPRESSED 
> [BooleanValue] optional boolean BooleanValue  [PLAIN, RLE, BIT_PACKED], 
> 8179722}, ColumnMetaData{UNCOMPRESSED [DateValue] optional int32 DateValue 
> (DATE)  [PLAIN, RLE, BIT_PACKED], 8179916}, ColumnMetaData{UNCOMPRESSED 
> [FloatValue] optional float FloatValue  [PLAIN, RLE, BIT_PACKED], 8184959}, 
> ColumnMetaData{UNCOMPRESSED [VarcharValue1] optional binary VarcharValue1 
> (UTF8)  [PLAIN, RLE, BIT_PACKED], 8190002}, ColumnMetaData{UNCOMPRESSED 
> [DoubleValue] optional double DoubleValue  [PLAIN, RLE, BIT_PACKED], 
> 10230058}, ColumnMetaData{UNCOMPRESSED [IntegerValue] optional int32 
> IntegerValue  [PLAIN, RLE, BIT_PACKED], 10240111}, 
> ColumnMetaData{UNCOMPRESSED [TimeValue] optional int32 TimeValue 
> (TIME_MILLIS)  [PLAIN, RLE, BIT_PACKED], 10245154}, 
> ColumnMetaData{UNCOMPRESSED [TimestampValue] optional int64 TimestampValue 
> (TIMESTAMP_MILLIS)  [PLAIN, RLE, BIT_PACKED], 10250197}, 
> ColumnMetaData{UNCOMPRESSED [VarbinaryValue2] optional binary VarbinaryValue2 
>  [PLAIN, RLE, BIT_PACKED], 10260250}, ColumnMetaData{UNCOMPRESSED 
> [IntervalYearValue] optional fixed_len_byte_array(12) IntervalYearValue 
> (INTERVAL)  [PLAIN, RLE, BIT_PACKED], 19632385}, ColumnMetaData{UNCOMPRESSED 
> [IntervalDayValue] optional fixed_len_byte_array(12) IntervalDayValue 
> (INTERVAL)  [PLAIN, RLE, BIT_PACKED], 19647446}, ColumnMetaData{UNCOMPRESSED 
> [IntervalSecondValue] optional fixed_len_byte_array(12) IntervalSecondValue 
> (INTERVAL)  [PLAIN, RLE, BIT_PACKED], 19662507}, ColumnMetaData{UNCOMPRESSED 
> [VarcharValue2] optional binary VarcharValue2 (UTF8)  [PLAIN, RLE, 
> BIT_PACKED], 19677568}]}]}
> Fragment 0:0
> [Error Id: 25852cdb-3217-4041-9743-66e9f3a2fbe4 on qa-node186.qa.lab:31010] 
> (state=,code=0)
> {noformat}
> The table can be found at 10.10.100.186:/tmp/fourvarchar_asc_nulls_16MB.parquet
> sys.version is:
> 1.15.0-SNAPSHOT a05f17d6fcd80f0d21260d3b1074ab895f457bac        Changed 
> PROJECT_OUTPUT_BATCH_SIZE to System + Session   30.07.2018 @ 17:12:53 PDT     
>   r...@mapr.com   30.07.2018 @ 17:25:21 PDT
> fourvarchar_asc_nulls70.q



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
