[ https://issues.apache.org/jira/browse/ORC-991?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Dongjoon Hyun updated ORC-991:
------------------------------
    Priority: Blocker  (was: Major)

> encrypt data throw exception with a sql filter push down
> --------------------------------------------------------
>
>                 Key: ORC-991
>                 URL: https://issues.apache.org/jira/browse/ORC-991
>             Project: ORC
>          Issue Type: Bug
>          Components: Java
>    Affects Versions: 1.6.8, 1.6.9, 1.6.10
>         Environment: 1. ORC 1.6.8+
>                      2. SparkSQL 2.4.7
>                      3. JDK 1.8
>            Reporter: hgs
>            Priority: Blocker
>         Attachments: files.zip
>
>
> 1. Create a table
> CREATE TABLE `itmp8888`(`id` INT, `name` STRING)
> ROW FORMAT SERDE 'org.apache.hadoop.hive.ql.io.orc.OrcSerde'
> WITH SERDEPROPERTIES (
>   'serialization.format' = '1'
> )
> STORED AS
>   INPUTFORMAT 'org.apache.hadoop.hive.ql.io.orc.OrcInputFormat'
>   OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat'
> TBLPROPERTIES (
>   'transient_lastDdlTime' = '1631174384',
>   'orc.encrypt' = 'AES_CTR_128:id,name',
>   'orc.mask' = 'sha256:id,name',
>   'orc.encrypt.ezk' = 'jNCeDBtNfT8wPaTpR34JHA=='
> )
> 2. Insert data
> 3. A select statement with no filter works fine
> select * from itmp8888
> 4. A select statement with a filter on an encrypted column throws an exception
> select * from itmp8888 where id = 1
>
> 5. The stack trace
> Caused by: java.lang.AssertionError: Index is not populated for 1
>   at org.apache.orc.impl.RecordReaderImpl$SargApplier.pickRowGroups(RecordReaderImpl.java:995)
>   at org.apache.orc.impl.RecordReaderImpl.pickRowGroups(RecordReaderImpl.java:1083)
>   at org.apache.orc.impl.RecordReaderImpl.readStripe(RecordReaderImpl.java:1101)
>   at org.apache.orc.impl.RecordReaderImpl.advanceStripe(RecordReaderImpl.java:1151)
>   at org.apache.orc.impl.RecordReaderImpl.advanceToNextRow(RecordReaderImpl.java:1186)
>   at org.apache.orc.impl.RecordReaderImpl.<init>(RecordReaderImpl.java:248)
>   at org.apache.orc.impl.ReaderImpl.rows(ReaderImpl.java:864)
>   at org.apache.spark.sql.execution.datasources.orc.OrcColumnarBatchReader.initialize(OrcColumnarBatchReader.java:142)
>   at org.apache.spark.sql.execution.datasources.orc.OrcFileFormat$$anonfun$buildReaderWithPartitionValues$2.apply(OrcFileFormat.scala:211)
>   at org.apache.spark.sql.execution.datasources.orc.OrcFileFormat$$anonfun$buildReaderWithPartitionValues$2.apply(OrcFileFormat.scala:175)
>   at org.apache.spark.sql.execution.datasources.FileScanRDD$$anon$1.org$apache$spark$sql$execution$datasources$FileScanRDD$$anon$$readCurrentFile(FileScanRDD.scala:124)
>   at org.apache.spark.sql.execution.datasources.FileScanRDD$$anon$1.nextIterator(FileScanRDD.scala:177)
>   at org.apache.spark.sql.execution.datasources.FileScanRDD$$anon$1.hasNext(FileScanRDD.scala:101)
> 6. Debugging the code, I found that the RowIndex is null for all the encrypted columns



--
This message was sent by Atlassian Jira
(v8.3.4#803005)