Alexandre Fouché created HIVE-3919: --------------------------------------
Summary: Partition query refused when PPD is disabled (due to HIVE-3699) Key: HIVE-3919 URL: https://issues.apache.org/jira/browse/HIVE-3919 Project: Hive Issue Type: Bug Components: Query Processor Environment: Cloudera 4.1 on Amazon Linux (rebranded Centos 6): hive-0.9.0+150-1.cdh4.1.1.p0.4.el6.noarch Reporter: Alexandre Fouché The query below was working perfectly when hive.optimize.ppd=true. But i had to disable this property because my results are impacted by unresolved HIVE-3699. But now that hive.optimize.ppd=false, the query from table1 partitionned by field 'day' is refused, even if the query has the partition predicate. >From HIVE-830 it says that the partition pruner depends on it, so that may >explain why my queries are refused when Predictive Push Down is disabled. The query is accepted and run if i set hive.mapred.mode=nonstrict. But from Hive-server2 logs and input path lists and also processing, all dataset regardless of partitions is deserialized ! set hive.optimize.ppd = true; set hive.mapred.mode=strict; FROM table1 INSERT OVERWRITE TABLE general1 SELECT qs_cs_s_kid, day, qs_cs_s_aid, COUNT(*) WHERE qs_cs_s_cat='PRINT' AND (day>=20121015 and day<=20130103) GROUP BY qs_cs_s_kid, day, qs_cs_s_aid; -> Parsed fine, but wrong results when i add multiple 'insert overwrite' set hive.optimize.ppd = false; set hive.mapred.mode=strict; FROM table1 INSERT OVERWRITE TABLE general1 SELECT qs_cs_s_kid, day, qs_cs_s_aid, COUNT(*) WHERE qs_cs_s_cat='PRINT' AND (day>=20121015 and day<=20130103) GROUP BY qs_cs_s_kid, day, qs_cs_s_aid; Error: Error while processing statement: FAILED: SemanticException [Error 10041]: No partition predicate found for Alias "table1" Table "table1" (state=42000,code=10041) Error: Error while processing statement: FAILED: SemanticException [Error 10041]: No partition predicate found for Alias "table1" Table "table1" (state=42000,code=10041) The table 'table1' was created like this, it is partitionned by 'day': CREATE TABLE table1 PARTITIONED BY (day BIGINT) ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.avro.AvroSerDe' STORED AS INPUTFORMAT 'org.apache.hadoop.hive.ql.io.avro.AvroContainerInputFormat' OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.avro.AvroContainerOutputFormat' TBLPROPERTIES ('avro.schema.literal'='{ "type": "record", "name": "Log", "fields" : [ {"name": "time_stamp", "type": ["null", "int"], "default": null}, (...) ] }') ; set hive.optimize.ppd = false; set hive.mapred.mode=strict; FROM table1 INSERT OVERWRITE TABLE general1 SELECT qs_cs_s_kid, day, qs_cs_s_aid, COUNT(*) WHERE qs_cs_s_cat='PRINT' AND day=20121015 GROUP BY qs_cs_s_kid, day, qs_cs_s_aid; -> FAILS TOO set hive.optimize.ppd = false; INSERT OVERWRITE TABLE general1 SELECT qs_cs_s_kid, day, qs_cs_s_aid, COUNT(*) FROM table1 WHERE qs_cs_s_cat='PRINT' AND day=20121015 GROUP BY qs_cs_s_kid, day, qs_cs_s_aid; -> FAILS TOO set hive.optimize.ppd = false; set hive.mapred.mode=nonstrict; FROM table1 INSERT OVERWRITE TABLE general1 SELECT qs_cs_s_kid, day, qs_cs_s_aid, COUNT(*) WHERE qs_cs_s_cat='PRINT' AND day=20121015 GROUP BY qs_cs_s_kid, day, qs_cs_s_aid; -> WORKS, but from Hive-server2 logs and input path lists and also processing, all dataset regardless of partitions is deserialized ! -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira