Hayok created HIVE-7644:
---------------------------

Summary: hive custom udf cannot be used in the join_condition(on)
Key: HIVE-7644
URL: https://issues.apache.org/jira/browse/HIVE-7644
Project: Hive
Issue Type: Bug
Components: Clients
Affects Versions: 0.12.0
Reporter: Hayok

hive> ADD JAR xxxxx;
Added xxxxx to class path
Added resource: xxxxx
hive> create temporary function func1 as 'xxx';
OK
Time taken: 0.009 seconds
hive> list jars;
xxx.jar
hive> select /*+ MAPJOIN(certain column1) */
    > *
    > from tb1
    > join tb2 on tb1.column2 = func1(tb2.column3)
    > ;
Total MapReduce jobs = 1
14/08/07 17:38:04 WARN conf.Configuration: file:/tmp/[username]hive_2014-08-07_17-38-01_048_6199454015323812186-1/-local-10005/jobconf.xml:an attempt to override final parameter: mapreduce.job.end-notification.max.retry.interval; Ignoring.
14/08/07 17:38:04 WARN conf.Configuration: file:/tmp/[username]/hive_2014-08-07_17-38-01_048_6199454015323812186-1/-local-10005/jobconf.xml:an attempt to override final parameter: mapreduce.job.end-notification.max.attempts; Ignoring.
14/08/07 17:38:04 INFO Configuration.deprecation: mapred.input.dir.recursive is deprecated. Instead, use mapreduce.input.fileinputformat.input.dir.recursive
14/08/07 17:38:04 INFO Configuration.deprecation: mapred.max.split.size is deprecated. Instead, use mapreduce.input.fileinputformat.split.maxsize
14/08/07 17:38:04 INFO Configuration.deprecation: mapred.min.split.size is deprecated. Instead, use mapreduce.input.fileinputformat.split.minsize
14/08/07 17:38:04 INFO Configuration.deprecation: mapred.min.split.size.per.rack is deprecated. Instead, use mapreduce.input.fileinputformat.split.minsize.per.rack
14/08/07 17:38:04 INFO Configuration.deprecation: mapred.min.split.size.per.node is deprecated. Instead, use mapreduce.input.fileinputformat.split.minsize.per.node
14/08/07 17:38:04 INFO Configuration.deprecation: mapred.reduce.tasks is deprecated. Instead, use mapreduce.job.reduces
14/08/07 17:38:04 INFO Configuration.deprecation: mapred.reduce.tasks.speculative.execution is deprecated. Instead, use mapreduce.reduce.speculative
14/08/07 17:38:05 WARN conf.HiveConf: DEPRECATED: Configuration property hive.metastore.local no longer has any effect.
Make sure to provide a valid value for hive.metastore.uris if you are connecting to a remote metastore.
Execution log at: /tmp/[username]/[username]_20140807173838_d673690f-c452-4ebb-bf53-9d663c49d04e.log
2014-08-07 05:38:05 Starting to launch local task to process map join; maximum memory = 2027290624
Execution failed with exit status: 2
Obtaining error information

Task failed!
Task ID:
  Stage-4

Logs:

/tmp/[username]/hive.log
FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.mr.MapredLocalTask

I then checked the execution log at /tmp/[username]/[username]_20140807173838_d673690f-c452-4ebb-bf53-9d663c49d04e.log, which contains:

2014-08-07 16:46:59,105 INFO mr.MapredLocalTask (SessionState.java:printInfo(417)) - 2014-08-07 04:46:59 Starting to launch local task to process map join; maximum memory = 2027290624
2014-08-07 16:46:59,114 INFO mr.MapredLocalTask (MapredLocalTask.java:initializeOperators(389)) - fetchoperator for tmp_compete created
2014-08-07 16:46:59,196 INFO exec.TableScanOperator (Operator.java:initialize(338)) - Initializing Self 0 TS
2014-08-07 16:46:59,197 INFO exec.TableScanOperator (Operator.java:initializeChildren(403)) - Operator 0 TS initialized
2014-08-07 16:46:59,197 INFO exec.TableScanOperator (Operator.java:initializeChildren(407)) - Initializing children of 0 TS
2014-08-07 16:46:59,197 INFO exec.HashTableSinkOperator (Operator.java:initialize(442)) - Initializing child 1 HASHTABLESINK
2014-08-07 16:46:59,197 INFO exec.HashTableSinkOperator (Operator.java:initialize(338)) - Initializing Self 1 HASHTABLESINK
2014-08-07 16:46:59,198 INFO mapjoin.MapJoinMemoryExhaustionHandler (MapJoinMemoryExhaustionHandler.java:<init>(72)) - JVM Max Heap Size: 2027290624
2014-08-07 16:46:59,222 ERROR mr.MapredLocalTask (MapredLocalTask.java:executeFromChildJVM(324)) - Hive Runtime Error: Map local work failed
org.apache.hadoop.hive.ql.exec.UDFArgumentException: The UDF implementation class 'xxx' is not present in the class path
    at org.apache.hadoop.hive.ql.udf.generic.GenericUDFBridge.initialize(GenericUDFBridge.java:142)
    at org.apache.hadoop.hive.ql.udf.generic.GenericUDF.initializeAndFoldConstants(GenericUDF.java:116)
    at org.apache.hadoop.hive.ql.exec.ExprNodeGenericFuncEvaluator.initialize(ExprNodeGenericFuncEvaluator.java:127)
    at org.apache.hadoop.hive.ql.exec.JoinUtil.getObjectInspectorsFromEvaluators(JoinUtil.java:66)
    at org.apache.hadoop.hive.ql.exec.HashTableSinkOperator.initializeOp(HashTableSinkOperator.java:140)
    at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:377)
    at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:453)
    at org.apache.hadoop.hive.ql.exec.Operator.initializeChildren(Operator.java:409)
    at org.apache.hadoop.hive.ql.exec.TableScanOperator.initializeOp(TableScanOperator.java:188)
    at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:377)
    at org.apache.hadoop.hive.ql.exec.mr.MapredLocalTask.initializeOperators(MapredLocalTask.java:408)
    at org.apache.hadoop.hive.ql.exec.mr.MapredLocalTask.executeFromChildJVM(MapredLocalTask.java:302)
    at org.apache.hadoop.hive.ql.exec.mr.ExecDriver.main(ExecDriver.java:728)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at org.apache.hadoop.util.RunJar.main(RunJar.java:212)
------------------------------------------------------------------------------------

I have confirmed that this is not an authorization problem. When the UDF is used outside the join condition, for example in select udf(column_name) or where udf(column_name), it works correctly.

Has anyone else encountered this problem?



--
This message was sent by Atlassian JIRA
(v6.2#6252)