[ https://issues.apache.org/jira/browse/HIVE-15339?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15719902#comment-15719902 ]
Hive QA commented on HIVE-15339:
--------------------------------

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12841454/HIVE-15339.1.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 10 failed/errored test(s), 10761 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.org.apache.hadoop.hive.cli.TestCliDriver (batchId=50)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[index_auto_mult_tables] (batchId=77)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[sample2] (batchId=5)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[sample4] (batchId=15)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[sample6] (batchId=61)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[sample7] (batchId=60)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[sample9] (batchId=38)
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[orc_ppd_schema_evol_3a] (batchId=134)
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[transform_ppr2] (batchId=134)
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[explainanalyze_2] (batchId=92)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/2407/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/2407/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-2407/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 10 tests failed
{noformat}

This message is automatically generated.
ATTACHMENT ID: 12841454 - PreCommit-HIVE-Build

> Prefetch column stats for fields needed in FilterSelectivityEstimator
> ---------------------------------------------------------------------
>
>                 Key: HIVE-15339
>                 URL: https://issues.apache.org/jira/browse/HIVE-15339
>             Project: Hive
>          Issue Type: Improvement
>            Reporter: Rajesh Balamohan
>            Priority: Minor
>         Attachments: HIVE-15339.1.patch
>
>
> Depending on the query pattern, {{FilterSelectivityEstimator}} fetches column
> statistics from the metastore in multiple calls. For instance, in the
> following query, it ends up fetching individual column statistics for
> flights multiple times.
> When the table has a large number of partitions, fetching column statistics
> via multiple calls can be very expensive and adversely impacts the overall
> compilation time. The following query took 14 seconds to compile.
> {noformat}
> SELECT COUNT(`flights`.`flightnum`) AS `cnt_flightnum_ok`,
> YEAR(`flights`.`dateofflight`) AS `yr_flightdate_ok`
> FROM `flights` as `flights`
> JOIN `airlines` ON (`flights`.`uniquecarrier` = `airlines`.`code`)
> JOIN `airports` as `source_airport` ON (`flights`.`origin` = `source_airport`.`iata`)
> JOIN `airports` as `dest_airport` ON (`flights`.`dest` = `dest_airport`.`iata`)
> GROUP BY YEAR(`flights`.`dateofflight`);
> {noformat}
> It may be helpful to collect all columns that need statistics and fetch
> them in a single remote call.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
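
The batching idea proposed in the issue description can be illustrated with a minimal, self-contained sketch. This is not the code in HIVE-15339.1.patch and does not use Hive's metastore API: {{FakeStatsClient}}, {{PrefetchedStats}}, {{fetchStats}}, and {{require}} are hypothetical names invented for illustration, and the returned values are placeholders. The sketch assumes only that each stats request costs one remote round trip, so registering all needed columns first and fetching them together replaces several calls with one.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

public class ColumnStatsPrefetchSketch {

    /** Hypothetical stand-in for a remote stats service; it only counts round trips. */
    static class FakeStatsClient {
        int remoteCalls = 0;

        /** One simulated round trip returning a placeholder value per requested column. */
        Map<String, Long> fetchStats(String table, Set<String> columns) {
            remoteCalls++;
            Map<String, Long> stats = new HashMap<>();
            for (String column : columns) {
                stats.put(column, 0L); // placeholder, not a real statistic
            }
            return stats;
        }
    }

    /** Collects the columns that will be needed, fetches them once, then answers locally. */
    static class PrefetchedStats {
        private final FakeStatsClient client;
        private final String table;
        private final Set<String> needed = new LinkedHashSet<>();
        private Map<String, Long> cache; // filled lazily by the first lookup

        PrefetchedStats(FakeStatsClient client, String table) {
            this.client = client;
            this.table = table;
        }

        /** Register a column before any lookup; duplicates are ignored by the set. */
        void require(String column) {
            needed.add(column);
        }

        /** The first lookup triggers the single batched fetch; unregistered columns yield null. */
        Long get(String column) {
            if (cache == null) {
                cache = client.fetchStats(table, needed);
            }
            return cache.get(column);
        }
    }

    public static void main(String[] args) {
        // Columns of `flights` referenced by the query in the issue description.
        String[] columns = {"flightnum", "dateofflight", "uniquecarrier", "origin", "dest"};

        // Per-column pattern: one round trip for each column.
        FakeStatsClient perColumn = new FakeStatsClient();
        for (String column : columns) {
            perColumn.fetchStats("flights", Collections.singleton(column));
        }

        // Batched pattern: register everything first, then fetch once.
        FakeStatsClient batched = new FakeStatsClient();
        PrefetchedStats stats = new PrefetchedStats(batched, "flights");
        for (String column : columns) {
            stats.require(column);
        }
        for (String column : columns) {
            stats.get(column);
        }

        System.out.println("per-column calls: " + perColumn.remoteCalls
                + ", batched calls: " + batched.remoteCalls);
    }
}
```

The sketch defers the fetch until the first lookup so that every column registered up to that point travels in the same request. A real implementation would also have to decide what happens when a column is requested that was not registered in advance.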