Re: UDFs with package names
Yup, it was the directory structure com/mystuff/whateverUDF.class that was missing. Thought I had tried that before posting my question, but...

Thanks for your help!

From: Edward Capriolo edlinuxg...@gmail.com
To: user@hive.apache.org; Michael Malak michaelma...@yahoo.com
Sent: Tuesday, July 30, 2013 7:06 PM
Subject: Re: UDFs with package names

> It might be a better idea to use your own package com.mystuff.x. You might be running into an issue where Java is not finding the file because it assumes the relation between package and jar is 1 to 1.
>
> You might also be compiling wrong. If your package is com.mystuff, that class file should be in a directory structure com/mystuff/whateverUDF.class. I am not seeing that from your example.
UDFs with package names
Thus far, I've been able to create Hive UDFs, but now I need to define them within a Java package name (as opposed to the default Java package as I had been doing), but once I do that, I'm no longer able to load them into Hive.

First off, this works:

add jar /usr/lib/hive/lib/hive-contrib-0.10.0-cdh4.3.0.jar;
create temporary function row_sequence as 'org.apache.hadoop.hive.contrib.udf.UDFRowSequence';

Then I took the source code for UDFRowSequence.java from

http://svn.apache.org/repos/asf/hive/trunk/contrib/src/java/org/apache/hadoop/hive/contrib/udf/UDFRowSequence.java

and renamed the file and the class inside to UDFRowSequence2.java

I compile and deploy it with:

javac -cp /usr/lib/hive/lib/hive-exec-0.10.0-cdh4.3.0.jar:/usr/lib/hadoop/hadoop-common.jar UDFRowSequence2.java
jar cvf UDFRowSequence2.jar UDFRowSequence2.class
sudo cp UDFRowSequence2.jar /usr/local/lib

But in Hive, I get the following:

hive> add jar /usr/local/lib/UDFRowSequence2.jar;
Added /usr/local/lib/UDFRowSequence2.jar to class path
Added resource: /usr/local/lib/UDFRowSequence2.jar

hive> create temporary function row_sequence as 'org.apache.hadoop.hive.contrib.udf.UDFRowSequence2';
FAILED: Class org.apache.hadoop.hive.contrib.udf.UDFRowSequence2 not found
FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.FunctionTask

But if I comment out the package line in UDFRowSequence2.java (to put the UDF into the default Java package), it works:

hive> add jar /usr/local/lib/UDFRowSequence2.jar;
Added /usr/local/lib/UDFRowSequence2.jar to class path
Added resource: /usr/local/lib/UDFRowSequence2.jar

hive> create temporary function row_sequence as 'UDFRowSequence2';
OK
Time taken: 0.383 seconds

What am I doing wrong? I have a feeling it's something simple.
Re: UDFs with package names
It might be a better idea to use your own package com.mystuff.x. You might be running into an issue where Java is not finding the file because it assumes the relation between package and jar is 1 to 1.

You might also be compiling wrong. If your package is com.mystuff, that class file should be in a directory structure com/mystuff/whateverUDF.class. I am not seeing that from your example.

On Tue, Jul 30, 2013 at 8:00 PM, Michael Malak michaelma...@yahoo.com wrote:

> Thus far, I've been able to create Hive UDFs, but now I need to define them within a Java package name (as opposed to the default Java package as I had been doing), but once I do that, I'm no longer able to load them into Hive.
>
> First off, this works:
>
> add jar /usr/lib/hive/lib/hive-contrib-0.10.0-cdh4.3.0.jar;
> create temporary function row_sequence as 'org.apache.hadoop.hive.contrib.udf.UDFRowSequence';
>
> Then I took the source code for UDFRowSequence.java from
>
> http://svn.apache.org/repos/asf/hive/trunk/contrib/src/java/org/apache/hadoop/hive/contrib/udf/UDFRowSequence.java
>
> and renamed the file and the class inside to UDFRowSequence2.java
>
> I compile and deploy it with:
>
> javac -cp /usr/lib/hive/lib/hive-exec-0.10.0-cdh4.3.0.jar:/usr/lib/hadoop/hadoop-common.jar UDFRowSequence2.java
> jar cvf UDFRowSequence2.jar UDFRowSequence2.class
> sudo cp UDFRowSequence2.jar /usr/local/lib
>
> But in Hive, I get the following:
>
> hive> add jar /usr/local/lib/UDFRowSequence2.jar;
> Added /usr/local/lib/UDFRowSequence2.jar to class path
> Added resource: /usr/local/lib/UDFRowSequence2.jar
>
> hive> create temporary function row_sequence as 'org.apache.hadoop.hive.contrib.udf.UDFRowSequence2';
> FAILED: Class org.apache.hadoop.hive.contrib.udf.UDFRowSequence2 not found
> FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.FunctionTask
>
> But if I comment out the package line in UDFRowSequence2.java (to put the UDF into the default Java package), it works:
>
> hive> add jar /usr/local/lib/UDFRowSequence2.jar;
> Added /usr/local/lib/UDFRowSequence2.jar to class path
> Added resource: /usr/local/lib/UDFRowSequence2.jar
>
> hive> create temporary function row_sequence as 'UDFRowSequence2';
> OK
> Time taken: 0.383 seconds
>
> What am I doing wrong? I have a feeling it's something simple.
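
For anyone landing on this thread later, the directory-structure point from the reply can be checked before going back into Hive. The following is only a rough sketch: jar_entry_for and jar_has_class are made-up helper names, the classes output directory is an assumption, and it presumes the JDK's jar tool is on the PATH. It maps a fully qualified class name to the entry the jar has to contain, and the commented usage shows one way to get that layout, using javac -d so the compiler creates the package directories itself.

```shell
# Sketch only: jar_entry_for and jar_has_class are hypothetical helper names.

# Map a fully qualified class name to the entry a jar must contain,
# e.g. com.mystuff.WhateverUDF -> com/mystuff/WhateverUDF.class
jar_entry_for() {
    printf '%s.class\n' "$(printf '%s' "$1" | tr '.' '/')"
}

# Succeed only if the jar's listing contains exactly that entry.
# Arguments: path to the jar, fully qualified class name.
jar_has_class() {
    jar tf "$1" | grep -qxF "$(jar_entry_for "$2")"
}

# Possible usage (not run here; "classes" is an assumed output directory
# that must exist before javac is invoked):
#   mkdir -p classes
#   javac -d classes -cp <hive and hadoop jars> UDFRowSequence2.java
#   jar cvf UDFRowSequence2.jar -C classes .
#   jar_has_class UDFRowSequence2.jar org.apache.hadoop.hive.contrib.udf.UDFRowSequence2
```

In the original post the jar was built from a bare UDFRowSequence2.class, so a check like this would presumably have failed for the packaged name and passed only for the default-package name, which matches what Hive reported.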