[ 
https://issues.apache.org/jira/browse/DRILL-6453?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16503946#comment-16503946
 ] 

Khurram Faraaz commented on DRILL-6453:
---------------------------------------

[~ben-zvi] TPC-DS query 72 cannot be executed on the latest Apache master because of 
a known issue in the Parquet reader.

[~dgu-atmapr] could you please file a Jira for the issue below?
{noformat}
Caused by: org.apache.drill.common.exceptions.DrillRuntimeException: Error in 
parquet record reader.
Message: 
Hadoop path: 
/drill/testdata/tpcds_sf1/parquet/customer_demographics/0_0_0.parquet
Total records read: 0
Row group index: 0
Records in row group: 1920800
Parquet Metadata: ParquetMetaData{FileMetaData{schema: message root {
  optional int64 cd_demo_sk;
  optional binary cd_gender (UTF8);
  optional binary cd_marital_status (UTF8);
  optional binary cd_education_status (UTF8);
  optional int32 cd_purchase_estimate;
  optional binary cd_credit_rating (UTF8);
  optional int32 cd_dep_count;
  optional int32 cd_dep_employed_count;
  optional int32 cd_dep_college_count;
}
, metadata: {drill-writer.version=2, drill.version=1.13.0-SNAPSHOT}}, blocks: 
[BlockMetaData{1920800, 112509832 [ColumnMetaData{UNCOMPRESSED [cd_demo_sk] 
INT64  [BIT_PACKED, RLE, PLAIN], 4}, ColumnMetaData{UNCOMPRESSED [cd_gender] 
BINARY  [BIT_PACKED, RLE, PLAIN], 15367257}, ColumnMetaData{UNCOMPRESSED 
[cd_marital_status] BINARY  [BIT_PACKED, RLE, PLAIN], 24971685}, 
ColumnMetaData{UNCOMPRESSED [cd_education_status] BINARY  [BIT_PACKED, RLE, 
PLAIN], 34576113}, ColumnMetaData{UNCOMPRESSED [cd_purchase_estimate] INT32  
[BIT_PACKED, RLE, PLAIN], 60645586}, ColumnMetaData{UNCOMPRESSED 
[cd_credit_rating] BINARY  [BIT_PACKED, RLE, PLAIN], 68329176}, 
ColumnMetaData{UNCOMPRESSED [cd_dep_count] INT32  [BIT_PACKED, RLE, PLAIN], 
89459066}, ColumnMetaData{UNCOMPRESSED [cd_dep_employed_count] INT32  
[BIT_PACKED, RLE, PLAIN], 97142656}, ColumnMetaData{UNCOMPRESSED 
[cd_dep_college_count] INT32  [BIT_PACKED, RLE, PLAIN], 104826246}]}]}
        at 
org.apache.drill.exec.store.parquet.columnreaders.ParquetRecordReader.handleException(ParquetRecordReader.java:275)
 ~[drill-java-exec-1.14.0-SNAPSHOT.jar:1.14.0-SNAPSHOT]
        at 
org.apache.drill.exec.store.parquet.columnreaders.ParquetRecordReader.next(ParquetRecordReader.java:302)
 ~[drill-java-exec-1.14.0-SNAPSHOT.jar:1.14.0-SNAPSHOT]
        at 
org.apache.drill.exec.physical.impl.ScanBatch.next(ScanBatch.java:172) 
[drill-java-exec-1.14.0-SNAPSHOT.jar:1.14.0-SNAPSHOT]
        ... 21 common frames omitted
Caused by: java.lang.UnsupportedOperationException: Unsupoorted Operation
        at 
org.apache.drill.exec.store.parquet.columnreaders.PageReader.resetDefinitionLevelReader(PageReader.java:449)
 ~[drill-java-exec-1.14.0-SNAPSHOT.jar:1.14.0-SNAPSHOT]
        at 
org.apache.drill.exec.store.parquet.columnreaders.VarLenColumnBulkInput$VLColumnBulkInputCallback.resetDefinitionLevelReader(VarLenColumnBulkInput.java:422)
 ~[drill-java-exec-1.14.0-SNAPSHOT.jar:1.14.0-SNAPSHOT]
        at 
org.apache.drill.exec.store.parquet.columnreaders.VarLenBulkPageReader.getEntry(VarLenBulkPageReader.java:113)
 ~[drill-java-exec-1.14.0-SNAPSHOT.jar:1.14.0-SNAPSHOT]
        at 
org.apache.drill.exec.store.parquet.columnreaders.VarLenColumnBulkInput.next(VarLenColumnBulkInput.java:128)
 ~[drill-java-exec-1.14.0-SNAPSHOT.jar:1.14.0-SNAPSHOT]
        at 
org.apache.drill.exec.store.parquet.columnreaders.VarLenColumnBulkInput.next(VarLenColumnBulkInput.java:32)
 ~[drill-java-exec-1.14.0-SNAPSHOT.jar:1.14.0-SNAPSHOT]
        at 
org.apache.drill.exec.vector.VarCharVector$Mutator.setSafe(VarCharVector.java:624)
 ~[vector-1.14.0-SNAPSHOT.jar:1.14.0-SNAPSHOT]
        at 
org.apache.drill.exec.vector.NullableVarCharVector$Mutator.setSafe(NullableVarCharVector.java:719)
 ~[vector-1.14.0-SNAPSHOT.jar:1.14.0-SNAPSHOT]
        at 
org.apache.drill.exec.store.parquet.columnreaders.VarLengthColumnReaders$NullableVarCharColumn.setSafe(VarLengthColumnReaders.java:215)
 ~[drill-java-exec-1.14.0-SNAPSHOT.jar:1.14.0-SNAPSHOT]
        at 
org.apache.drill.exec.store.parquet.columnreaders.VarLengthValuesColumn.readRecordsInBulk(VarLengthValuesColumn.java:98)
 ~[drill-java-exec-1.14.0-SNAPSHOT.jar:1.14.0-SNAPSHOT]
        at 
org.apache.drill.exec.store.parquet.columnreaders.VarLenBinaryReader.readRecordsInBulk(VarLenBinaryReader.java:102)
 ~[drill-java-exec-1.14.0-SNAPSHOT.jar:1.14.0-SNAPSHOT]
        at 
org.apache.drill.exec.store.parquet.columnreaders.VarLenBinaryReader.readFields(VarLenBinaryReader.java:90)
 ~[drill-java-exec-1.14.0-SNAPSHOT.jar:1.14.0-SNAPSHOT]
        at 
org.apache.drill.exec.store.parquet.columnreaders.BatchReader$VariableWidthReader.readRecords(BatchReader.java:166)
 ~[drill-java-exec-1.14.0-SNAPSHOT.jar:1.14.0-SNAPSHOT]
        at 
org.apache.drill.exec.store.parquet.columnreaders.BatchReader.readBatch(BatchReader.java:42)
 ~[drill-java-exec-1.14.0-SNAPSHOT.jar:1.14.0-SNAPSHOT]
        at 
org.apache.drill.exec.store.parquet.columnreaders.ParquetRecordReader.next(ParquetRecordReader.java:300)
 ~[drill-java-exec-1.14.0-SNAPSHOT.jar:1.14.0-SNAPSHOT]
        ... 22 common frames omitted{noformat}
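
For triage, the innermost cause in a trace like the one above can be pulled out mechanically. The following is only a minimal Python sketch, assuming the trace has been saved as plain text with each "Caused by:" entry on a single line; the function name `root_cause` and the shortened sample string are illustrative and not part of Drill.

```python
import re

def root_cause(trace):
    """Return (exception class, message) from the last 'Caused by:' line.

    Assumption: each 'Caused by:' entry sits on one line. Lines wrapped by
    a mail archive would need to be re-joined before calling this.
    """
    pattern = r"^Caused by: ([\w.$]+)(?:: (.*))?$"
    matches = re.findall(pattern, trace, flags=re.MULTILINE)
    if not matches:
        raise ValueError("no 'Caused by:' line found")
    exc_class, message = matches[-1]  # the last entry is the innermost cause
    return exc_class, message.strip()

# Shortened, re-joined excerpt of the trace above (illustrative only).
sample = (
    "Caused by: org.apache.drill.common.exceptions.DrillRuntimeException: "
    "Error in parquet record reader.\n"
    "        at org.apache.drill.exec.physical.impl.ScanBatch.next(ScanBatch.java:172)\n"
    "Caused by: java.lang.UnsupportedOperationException: Unsupoorted Operation\n"
    "        at org.apache.drill.exec.store.parquet.columnreaders.PageReader"
    ".resetDefinitionLevelReader(PageReader.java:449)\n"
)

print(root_cause(sample))
```

Here the last "Caused by:" entry is the `UnsupportedOperationException` raised from `PageReader.resetDefinitionLevelReader`, which is the frame to quote when filing the separate Jira.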

> TPC-DS query 72 has regressed
> -----------------------------
>
>                 Key: DRILL-6453
>                 URL: https://issues.apache.org/jira/browse/DRILL-6453
>             Project: Apache Drill
>          Issue Type: Bug
>          Components: Execution - Flow
>    Affects Versions: 1.14.0
>            Reporter: Khurram Faraaz
>            Assignee: Boaz Ben-Zvi
>            Priority: Blocker
>             Fix For: 1.14.0
>
>         Attachments: 24f75b18-014a-fb58-21d2-baeab5c3352c.sys.drill
>
>
> TPC-DS query 72 seems to have regressed. The query profile for the run that was 
> canceled after 2 hours on Drill 1.14.0 is attached here.
> {noformat}
> On Drill 1.14.0-SNAPSHOT 
> commit : 931b43e (TPC-DS query 72 executed successfully on this commit, took 
> around 55 seconds to execute)
> SF1 parquet data on 4 nodes; 
> planner.memory.max_query_memory_per_node = 10737418240. 
> drill.exec.hashagg.fallback.enabled = true
> TPC-DS query 72 executed successfully & took 47 seconds to complete execution.
> {noformat}
> {noformat}
> The TPC-DS data in the run below stores date values as the DATE data type, not 
> VARCHAR.
> On Drill 1.14.0-SNAPSHOT
> commit : 82e1a12
> SF1 parquet data on 4 nodes; 
> planner.memory.max_query_memory_per_node = 10737418240. 
> drill.exec.hashagg.fallback.enabled = true
> and
> alter system set `exec.hashjoin.num_partitions` = 1;
> TPC-DS query 72 ran for 2 hrs and 11 mins without completing; I had to 
> cancel it by stopping the Foreman drillbit.
> As a result, several minor fragments are reported to be in the 
> CANCELLATION_REQUESTED state in the UI.
> {noformat}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
