[ 
https://issues.apache.org/jira/browse/HIVE-1986?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14076012#comment-14076012
 ] 

Navis commented on HIVE-1986:
-----------------------------

Use to_unix_timestamp() instead of unix_timestamp(). to_unix_timestamp()
requires an argument and is the deterministic variant.
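
A rough sketch of that substitution, reusing the kv_part table from the report quoted below. The comparison against key is a made-up predicate purely for illustration, and the two-argument form with an explicit 'yyyy-MM-dd' pattern is assumed here so the date string parses:

```sql
-- Illustrative only: the predicate on "key" is hypothetical.
-- unix_timestamp(...) is marked non-deterministic, so per this issue
-- the ds=2011 filter may fail to prune the other partitions:
explain select * from kv_part
where ds = 2011 and key < unix_timestamp('2010-01-01', 'yyyy-MM-dd');

-- to_unix_timestamp(...) always takes an argument and is deterministic,
-- so the partition pruner can still apply ds=2011:
explain select * from kv_part
where ds = 2011 and key < to_unix_timestamp('2010-01-01', 'yyyy-MM-dd');
```

Comparing the "Information added for path" DEBUG lines of the two plans, as in the report, should show whether only ds=2011 is scanned.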

> partition pruner do not take effect for non-deterministic UDF
> -------------------------------------------------------------
>
>                 Key: HIVE-1986
>                 URL: https://issues.apache.org/jira/browse/HIVE-1986
>             Project: Hive
>          Issue Type: Bug
>          Components: Query Processor
>    Affects Versions: 0.4.1, 0.5.0, 0.6.0, 0.7.0
>         Environment: trunk-src,hive default configure
>            Reporter: zhaowei
>             Fix For: 0.11.0
>
>
> A Hive UDF can be deterministic or non-deterministic, but when a query uses a
> non-deterministic UDF such as rand() or unix_timestamp(), partition pruning
> (ppr) does not take effect.
> Also, unix_timestamp() called with a parameter, for example
> unix_timestamp('2010-01-01'), should in my view be treated as deterministic.
> Test case:
> hive -hiveconf hive.root.logger=DEBUG,console
> create table kv_part(key int,value string) partitioned by(ds string);
> alter table kv_part add partition (ds=2010) partition (ds=2011) partition 
> (ds=2012);
> create table kv2(key int,value string) partitioned by(ds string);
> alter table kv2 add partition (ds=2013) partition (ds=2014) partition 
> (ds=2015);
> explain select * from kv_part join kv2 on(kv_part.key=kv2.key) where 
> kv_part.ds=2011 and rand() > 0.5
> rand() is non-deterministic, so kv_part.ds=2011 does not filter out the
> partitions ds=2010 and ds=2012:
> .....
> 11/02/14 12:22:32 DEBUG lazy.LazySimpleSerDe: 
> org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe initialized with: 
> columnNames=[key, value] columnTypes=[int, string] separator=[[B@1ac9683] 
> nullstring=\N lastColumnTakesRest=false
> 11/02/14 12:22:32 INFO hive.log: DDL: struct kv_part { i32 key, string value}
> 11/02/14 12:22:32 DEBUG optimizer.GenMapRedUtils: Information added for path 
> hdfs://172.25.38.253:54310/user/hive/warehouse/kv_part/ds=2010
> 11/02/14 12:22:32 DEBUG optimizer.GenMapRedUtils: Information added for path 
> hdfs://172.25.38.253:54310/user/hive/warehouse/kv_part/ds=2011
> 11/02/14 12:22:32 DEBUG optimizer.GenMapRedUtils: Information added for path 
> hdfs://172.25.38.253:54310/user/hive/warehouse/kv_part/ds=2012
> 11/02/14 12:22:32 INFO parse.SemanticAnalyzer: Completed plan generation
> .....
> explain select * from kv_part join kv2 on(kv_part.key=kv2.key) where 
> kv_part.ds=2011 and sin(kv2.key) < 0.5;
> sin() is deterministic, so ppr works correctly:
> .....
> 11/02/14 12:25:22 DEBUG optimizer.GenMapRedUtils: Information added for path 
> hdfs://172.25.38.253:54310/user/hive/warehouse/kv_part/ds=2011
> ....
> Users should also be able to find out from the wiki whether a UDF is
> deterministic, or we should add this information to the output of
> describe function.



--
This message was sent by Atlassian JIRA
(v6.2#6252)
