[ https://issues.apache.org/jira/browse/HIVE-21327?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16781231#comment-16781231 ]

Hive QA commented on HIVE-21327:
--------------------------------

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12960627/HIVE-21327.1.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 2 failed/errored test(s), 15824 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[parquet_ppd_char] (batchId=11)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[timestamptz_2] (batchId=86)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/16301/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/16301/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-16301/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 2 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12960627 - PreCommit-HIVE-Build

> Predicate is not pushed to Parquet if hive.parquet.timestamp.skip.conversion=true
> ---------------------------------------------------------------------------------
>
>                 Key: HIVE-21327
>                 URL: https://issues.apache.org/jira/browse/HIVE-21327
>             Project: Hive
>          Issue Type: Bug
>    Affects Versions: 4.0.0
>            Reporter: Marta Kuczora
>            Assignee: Marta Kuczora
>            Priority: Major
>         Attachments: HIVE-21327.1.patch
>
>
> The Parquet FilterPredicate is created and set to the configuration in the
> ParquetRecordReaderBase.setFilter method. This method is used from the
> ParquetRecordReaderWrapper constructor through the
> ParquetRecordReaderBase.getSplit method and expects a JobConf as parameter
> where it sets the created filter predicate.
> In the ParquetRecordReaderWrapper constructor, multiple JobConf objects are
> used:
> {noformat}
>     jobConf = oldJobConf;
>     final ParquetInputSplit split = getSplit(oldSplit, jobConf);
>     TaskAttemptID taskAttemptID = TaskAttemptID.forName(jobConf.get(IOConstants.MAPRED_TASK_ID));
>     if (taskAttemptID == null) {
>       taskAttemptID = new TaskAttemptID();
>     }
>     // create a TaskInputOutputContext
>     Configuration conf = jobConf;
>     if (skipTimestampConversion ^ HiveConf.getBoolVar(
>         conf, HiveConf.ConfVars.HIVE_PARQUET_TIMESTAMP_SKIP_CONVERSION)) {
>       conf = new JobConf(oldJobConf);
>       HiveConf.setBoolVar(conf,
>         HiveConf.ConfVars.HIVE_PARQUET_TIMESTAMP_SKIP_CONVERSION, skipTimestampConversion);
>     }
>     final TaskAttemptContext taskContext = ContextUtil.newTaskAttemptContext(conf, taskAttemptID);
> {noformat}
> So we have the jobConf, oldJobConf and conf objects, and getSplit is called
> with the jobConf object, so the filter predicate will be set into this
> config object. Based on this code alone, jobConf and oldJobConf should be the
> same reference inside the if statement, so the newly created conf should also
> contain the filter predicate. However, in the getSplit method the value of
> jobConf is changed by the projectionPusher.pushProjectionsAndFilters method,
> so inside the if statement, jobConf and oldJobConf are actually different
> references. The filter predicate is set in jobConf, but if the if condition
> is true, conf will be created from oldJobConf, so it won't contain the filter
> predicate.
> Just for reference, this behavior was introduced in
> [HIVE-9873|https://issues.apache.org/jira/browse/HIVE-9873].
> Since the goal of the if statement is only to update the
> HIVE_PARQUET_TIMESTAMP_SKIP_CONVERSION property in the configuration, it
> should be using the jobConf where the filter predicate is correctly set.

--
This message was sent by Atlassian JIRA (v7.6.3#76005)
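
The aliasing problem described in the quoted report can be reproduced without Hadoop or Hive on the classpath. The following is only a minimal sketch under stated assumptions: `FakeConf` is a hypothetical stand-in for `JobConf` (a key/value bag with a copy constructor), `pushAndSetFilter` imitates the reported effect of `getSplit` (the conf is replaced by a new object, and the filter is set on that new object), and both property keys are made up for illustration. It is not the actual Hive code or the attached patch.

```java
import java.util.HashMap;
import java.util.Map;

final class ConfAliasingSketch {
    // Hypothetical keys, used only in this sketch.
    static final String FILTER_KEY = "sketch.filter.predicate";
    static final String SKIP_KEY = "sketch.timestamp.skip.conversion";

    // Hypothetical stand-in for JobConf: a property map with a copy constructor.
    static final class FakeConf {
        private final Map<String, String> props = new HashMap<>();

        FakeConf() {}

        FakeConf(FakeConf other) {
            props.putAll(other.props);
        }

        void set(String key, String value) {
            props.put(key, value);
        }

        String get(String key) {
            return props.get(key);
        }
    }

    // Imitates the reported effect of getSplit: a new conf object is produced
    // and the filter predicate is stored on the new object, not the original.
    static FakeConf pushAndSetFilter(FakeConf original) {
        FakeConf replaced = new FakeConf(original);
        replaced.set(FILTER_KEY, "col = 'x'");
        return replaced;
    }

    // Mirrors the shape of the quoted constructor code. With copyFromOld == true
    // the copy is taken from oldJobConf (the reported behavior); with false it is
    // taken from jobConf, which is what the report suggests.
    static FakeConf buildTaskConf(FakeConf oldJobConf, boolean skipConversion,
                                  boolean copyFromOld) {
        FakeConf jobConf = pushAndSetFilter(oldJobConf);
        FakeConf conf = jobConf;
        boolean current = Boolean.parseBoolean(conf.get(SKIP_KEY));
        if (skipConversion ^ current) {
            conf = new FakeConf(copyFromOld ? oldJobConf : jobConf);
            conf.set(SKIP_KEY, Boolean.toString(skipConversion));
        }
        return conf;
    }

    public static void main(String[] args) {
        FakeConf reported = buildTaskConf(new FakeConf(), true, true);
        FakeConf suggested = buildTaskConf(new FakeConf(), true, false);
        System.out.println("copy from oldJobConf, filter = " + reported.get(FILTER_KEY));
        System.out.println("copy from jobConf,    filter = " + suggested.get(FILTER_KEY));
    }
}
```

In this sketch the filter is lost only when the skip-conversion flag differs from the value already in the conf, which matches the report's observation that the predicate goes missing only when the if condition is true.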