[
https://issues.apache.org/jira/browse/DRILL-6453?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16503946#comment-16503946
]
Khurram Faraaz commented on DRILL-6453:
---------------------------------------
[~ben-zvi] TPC-DS query 72 cannot be executed on the latest Apache master because of
a known issue in the Parquet reader.
[~dgu-atmapr] can you please file a Jira for the issue below?
{noformat}
Caused by: org.apache.drill.common.exceptions.DrillRuntimeException: Error in
parquet record reader.
Message:
Hadoop path:
/drill/testdata/tpcds_sf1/parquet/customer_demographics/0_0_0.parquet
Total records read: 0
Row group index: 0
Records in row group: 1920800
Parquet Metadata: ParquetMetaData{FileMetaData{schema: message root {
optional int64 cd_demo_sk;
optional binary cd_gender (UTF8);
optional binary cd_marital_status (UTF8);
optional binary cd_education_status (UTF8);
optional int32 cd_purchase_estimate;
optional binary cd_credit_rating (UTF8);
optional int32 cd_dep_count;
optional int32 cd_dep_employed_count;
optional int32 cd_dep_college_count;
}
, metadata: {drill-writer.version=2, drill.version=1.13.0-SNAPSHOT}}, blocks:
[BlockMetaData{1920800, 112509832 [ColumnMetaData{UNCOMPRESSED [cd_demo_sk]
INT64 [BIT_PACKED, RLE, PLAIN], 4}, ColumnMetaData{UNCOMPRESSED [cd_gender]
BINARY [BIT_PACKED, RLE, PLAIN], 15367257}, ColumnMetaData{UNCOMPRESSED
[cd_marital_status] BINARY [BIT_PACKED, RLE, PLAIN], 24971685},
ColumnMetaData{UNCOMPRESSED [cd_education_status] BINARY [BIT_PACKED, RLE,
PLAIN], 34576113}, ColumnMetaData{UNCOMPRESSED [cd_purchase_estimate] INT32
[BIT_PACKED, RLE, PLAIN], 60645586}, ColumnMetaData{UNCOMPRESSED
[cd_credit_rating] BINARY [BIT_PACKED, RLE, PLAIN], 68329176},
ColumnMetaData{UNCOMPRESSED [cd_dep_count] INT32 [BIT_PACKED, RLE, PLAIN],
89459066}, ColumnMetaData{UNCOMPRESSED [cd_dep_employed_count] INT32
[BIT_PACKED, RLE, PLAIN], 97142656}, ColumnMetaData{UNCOMPRESSED
[cd_dep_college_count] INT32 [BIT_PACKED, RLE, PLAIN], 104826246}]}]}
at
org.apache.drill.exec.store.parquet.columnreaders.ParquetRecordReader.handleException(ParquetRecordReader.java:275)
~[drill-java-exec-1.14.0-SNAPSHOT.jar:1.14.0-SNAPSHOT]
at
org.apache.drill.exec.store.parquet.columnreaders.ParquetRecordReader.next(ParquetRecordReader.java:302)
~[drill-java-exec-1.14.0-SNAPSHOT.jar:1.14.0-SNAPSHOT]
at
org.apache.drill.exec.physical.impl.ScanBatch.next(ScanBatch.java:172)
[drill-java-exec-1.14.0-SNAPSHOT.jar:1.14.0-SNAPSHOT]
... 21 common frames omitted
Caused by: java.lang.UnsupportedOperationException: Unsupoorted Operation
at
org.apache.drill.exec.store.parquet.columnreaders.PageReader.resetDefinitionLevelReader(PageReader.java:449)
~[drill-java-exec-1.14.0-SNAPSHOT.jar:1.14.0-SNAPSHOT]
at
org.apache.drill.exec.store.parquet.columnreaders.VarLenColumnBulkInput$VLColumnBulkInputCallback.resetDefinitionLevelReader(VarLenColumnBulkInput.java:422)
~[drill-java-exec-1.14.0-SNAPSHOT.jar:1.14.0-SNAPSHOT]
at
org.apache.drill.exec.store.parquet.columnreaders.VarLenBulkPageReader.getEntry(VarLenBulkPageReader.java:113)
~[drill-java-exec-1.14.0-SNAPSHOT.jar:1.14.0-SNAPSHOT]
at
org.apache.drill.exec.store.parquet.columnreaders.VarLenColumnBulkInput.next(VarLenColumnBulkInput.java:128)
~[drill-java-exec-1.14.0-SNAPSHOT.jar:1.14.0-SNAPSHOT]
at
org.apache.drill.exec.store.parquet.columnreaders.VarLenColumnBulkInput.next(VarLenColumnBulkInput.java:32)
~[drill-java-exec-1.14.0-SNAPSHOT.jar:1.14.0-SNAPSHOT]
at
org.apache.drill.exec.vector.VarCharVector$Mutator.setSafe(VarCharVector.java:624)
~[vector-1.14.0-SNAPSHOT.jar:1.14.0-SNAPSHOT]
at
org.apache.drill.exec.vector.NullableVarCharVector$Mutator.setSafe(NullableVarCharVector.java:719)
~[vector-1.14.0-SNAPSHOT.jar:1.14.0-SNAPSHOT]
at
org.apache.drill.exec.store.parquet.columnreaders.VarLengthColumnReaders$NullableVarCharColumn.setSafe(VarLengthColumnReaders.java:215)
~[drill-java-exec-1.14.0-SNAPSHOT.jar:1.14.0-SNAPSHOT]
at
org.apache.drill.exec.store.parquet.columnreaders.VarLengthValuesColumn.readRecordsInBulk(VarLengthValuesColumn.java:98)
~[drill-java-exec-1.14.0-SNAPSHOT.jar:1.14.0-SNAPSHOT]
at
org.apache.drill.exec.store.parquet.columnreaders.VarLenBinaryReader.readRecordsInBulk(VarLenBinaryReader.java:102)
~[drill-java-exec-1.14.0-SNAPSHOT.jar:1.14.0-SNAPSHOT]
at
org.apache.drill.exec.store.parquet.columnreaders.VarLenBinaryReader.readFields(VarLenBinaryReader.java:90)
~[drill-java-exec-1.14.0-SNAPSHOT.jar:1.14.0-SNAPSHOT]
at
org.apache.drill.exec.store.parquet.columnreaders.BatchReader$VariableWidthReader.readRecords(BatchReader.java:166)
~[drill-java-exec-1.14.0-SNAPSHOT.jar:1.14.0-SNAPSHOT]
at
org.apache.drill.exec.store.parquet.columnreaders.BatchReader.readBatch(BatchReader.java:42)
~[drill-java-exec-1.14.0-SNAPSHOT.jar:1.14.0-SNAPSHOT]
at
org.apache.drill.exec.store.parquet.columnreaders.ParquetRecordReader.next(ParquetRecordReader.java:300)
~[drill-java-exec-1.14.0-SNAPSHOT.jar:1.14.0-SNAPSHOT]
... 22 common frames omitted{noformat}
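For triage, the innermost exception in a chained Java trace is normally the last "Caused by:" entry; in the trace above that is the UnsupportedOperationException raised from PageReader.resetDefinitionLevelReader. Below is a minimal Python sketch for pulling that entry out of a saved log. The helper name root_cause and the shortened sample string are illustrative assumptions, not part of Drill.

```python
import re


def root_cause(trace: str) -> str:
    """Return the text of the last 'Caused by:' line in a Java stack trace.

    Hypothetical helper for log triage; it only looks at the first line of
    each 'Caused by:' entry, so wrapped message text is not included.
    """
    causes = re.findall(r"^Caused by: (.+)$", trace, flags=re.MULTILINE)
    if not causes:
        raise ValueError("no 'Caused by:' line found")
    return causes[-1].strip()


# Shortened excerpt of the trace above, used only as sample input.
sample = """\
Caused by: org.apache.drill.common.exceptions.DrillRuntimeException: Error in
parquet record reader.
at org.apache.drill.exec.physical.impl.ScanBatch.next(ScanBatch.java:172)
Caused by: java.lang.UnsupportedOperationException: Unsupoorted Operation
at org.apache.drill.exec.store.parquet.columnreaders.PageReader.resetDefinitionLevelReader(PageReader.java:449)
"""

print(root_cause(sample))
```

The message text, including its spelling, is reproduced exactly as it appears in the log so that it can be searched for in the source.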
> TPC-DS query 72 has regressed
> -----------------------------
>
> Key: DRILL-6453
> URL: https://issues.apache.org/jira/browse/DRILL-6453
> Project: Apache Drill
> Issue Type: Bug
> Components: Execution - Flow
> Affects Versions: 1.14.0
> Reporter: Khurram Faraaz
> Assignee: Boaz Ben-Zvi
> Priority: Blocker
> Fix For: 1.14.0
>
> Attachments: 24f75b18-014a-fb58-21d2-baeab5c3352c.sys.drill
>
>
> TPC-DS query 72 seems to have regressed. The query profile for the run that was
> canceled after 2 hours on Drill 1.14.0 is attached.
> {noformat}
> On Drill 1.14.0-SNAPSHOT
> commit : 931b43e (TPC-DS query 72 executed successfully on this commit, took
> around 55 seconds to execute)
> SF1 parquet data on 4 nodes;
> planner.memory.max_query_memory_per_node = 10737418240.
> drill.exec.hashagg.fallback.enabled = true
> TPC-DS query 72 executed successfully & took 47 seconds to complete execution.
> {noformat}
> {noformat}
> TPC-DS data in the run below has date values stored as the DATE datatype, not
> as VARCHAR
> On Drill 1.14.0-SNAPSHOT
> commit : 82e1a12
> SF1 parquet data on 4 nodes;
> planner.memory.max_query_memory_per_node = 10737418240.
> drill.exec.hashagg.fallback.enabled = true
> and
> alter system set `exec.hashjoin.num_partitions` = 1;
> TPC-DS query 72 ran for 2 hrs and 11 mins without completing; I had to
> cancel it by stopping the Foreman drillbit.
> As a result, several minor fragments are reported to be in the
> CANCELLATION_REQUESTED state in the UI.
> {noformat}