> WHERE p IN (SELECT p FROM t2)
> here we could argue that Hive could optimize this by computing the subquery first,
> and then do the partition pruning, but sadly I don't think this optimisation has
> been implemented yet

It is implemented already - <https://issues.apache.org/jira/browse/HIVE-7826>

In Hive-1.x, the optimization doesn't kick in when the partition column has a UDF wrapped around it. In Hive-2.0, it applies even if the partition column is wrapped in a UDF.

"explain rewrite .... where p IN (Select p from t2);" will show the rewrite which enables DPP (dynamic partition pruning).

> An example of non-deterministic function are rand() and unix_timestamp()
> because it is evaluated differently at each row

Yes, that is exactly right.

Another case was TO_DATE(), which in Hive-1.x returned Strings and prevented the removal of partitions.

Cheers,
Gopal
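
P.S. A rough sketch of how one might check this on a test cluster. The table name t1 (partitioned by p) is made up for illustration, only t2 and p come from the thread, and I am assuming the Tez execution engine, since that is where the HIVE-7826 pruning runs:

```sql
-- Hypothetical setup: t1 is partitioned by p; t2 is a small table with a column p.
SET hive.execution.engine=tez;
SET hive.tez.dynamic.partition.pruning=true;

-- Print the rewritten form of the IN subquery without running the query.
EXPLAIN REWRITE
SELECT * FROM t1 WHERE p IN (SELECT p FROM t2);

-- Plain EXPLAIN shows the operator plan, where any pruning step would appear.
EXPLAIN
SELECT * FROM t1 WHERE p IN (SELECT p FROM t2);
```

The exact plan text differs between Hive versions, so treat this as a starting point rather than a guaranteed output.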