[jira] Commented: (HIVE-1157) UDFs can't be loaded via add jar when jar is on HDFS
[ https://issues.apache.org/jira/browse/HIVE-1157?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12916956#action_12916956 ]

Philip Zeyliger commented on HIVE-1157:
---------------------------------------

Namit,

Thanks for the review. I've fixed the test failures. The one you pointed out was a missing log line from the results. And there was a second one having to do with relative paths.

Oddly enough, however, when I tried to bring the changes up to current trunk, it turned out that HIVE-1624 conflicted, and, when I looked at it, it turns out to supply the same feature as this patch. I'll upload the fixed patch for posterity, but it looks like this issue is no longer necessary.

-- Philip

UDFs can't be loaded via add jar when jar is on HDFS
----------------------------------------------------

Key: HIVE-1157
URL: https://issues.apache.org/jira/browse/HIVE-1157
Project: Hadoop Hive
Issue Type: Improvement
Components: Query Processor
Reporter: Philip Zeyliger
Priority: Minor
Attachments: hive-1157.patch.txt, HIVE-1157.patch.v3.txt, HIVE-1157.patch.v4.txt, HIVE-1157.patch.v5.txt, HIVE-1157.v2.patch.txt, output.txt

As discussed on the mailing list, it would be nice if you could use UDFs that are on jars on HDFS. The proposed implementation would be for add jar to recognize that the target file is on HDFS, copy it locally, and load it into the classpath.

{quote}
Hi folks,

I have a quick question about UDF support in Hive. I'm on the 0.5 branch. Can you use a UDF where the jar which contains the function is on HDFS, and not on the local filesystem. Specifically, the following does not seem to work:

# This is Hive 0.5, from svn
$bin/hive
Hive history file=/tmp/philip/hive_job_log_philip_201002081541_370227273.txt
hive> add jar hdfs://localhost/FooTest.jar;
Added hdfs://localhost/FooTest.jar to class path
hive> create temporary function cube as 'com.cloudera.FooTestUDF';
FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.FunctionTask

Does this work for other people? I could probably fix it by changing add jar to download remote jars locally, when necessary (to load them into the classpath), or update URLClassLoader (or whatever is underneath there) to read directly from HDFS, which seems a bit more fragile. But I wanted to make sure that my interpretation of what's going on is right before I have at it.

Thanks,

-- Philip
{quote}

{quote}
Yes that's correct. I prefer to download the jars in add jar.

Zheng
{quote}

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
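The approach favored in the issue description (have add jar notice that the jar is remote, copy it to local disk, then put the local copy on the classpath) can be sketched roughly as follows. This is only an illustration under stated assumptions, not the code from any of the attached patches: `AddJarSketch`, `RemoteFetcher`, `resolveToLocal` and `withJar` are hypothetical names, and `RemoteFetcher` stands in for whatever filesystem client actually performs the copy.

```java
import java.io.IOException;
import java.net.URI;
import java.net.URL;
import java.net.URLClassLoader;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

final class AddJarSketch {

    /** Hypothetical stand-in for a remote filesystem client; not a Hive or Hadoop API. */
    interface RemoteFetcher {
        void copyToLocal(URI source, Path target) throws IOException;
    }

    /** Returns a local path for the jar, downloading it first when the URI is not local. */
    static Path resolveToLocal(URI jar, Path downloadDir, RemoteFetcher fetcher) throws IOException {
        String scheme = jar.getScheme();
        if (scheme == null) {
            // Plain path such as /tmp/foo.jar or ../data/files/TestSerDe.jar: already local.
            return Paths.get(jar.getPath());
        }
        if (scheme.equals("file")) {
            return Paths.get(jar);
        }
        // Anything else (for example hdfs://...) is treated as remote and copied locally.
        String fileName = Paths.get(jar.getPath()).getFileName().toString();
        Files.createDirectories(downloadDir);
        Path target = downloadDir.resolve(fileName);
        fetcher.copyToLocal(jar, target);
        return target;
    }

    /** Builds a class loader that can see the local jar in addition to the parent's classes. */
    static URLClassLoader withJar(Path localJar, ClassLoader parent) throws IOException {
        return new URLClassLoader(new URL[] { localJar.toUri().toURL() }, parent);
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("addjar-sketch");
        // The fake fetcher just creates an empty file; a real one would copy bytes from HDFS.
        RemoteFetcher fake = (source, target) -> Files.write(target, new byte[0]);
        Path local = resolveToLocal(URI.create("hdfs://localhost/FooTest.jar"), dir, fake);
        System.out.println("Local copy: " + local);
        try (URLClassLoader loader = withJar(local, AddJarSketch.class.getClassLoader())) {
            System.out.println("Loader URLs: " + loader.getURLs().length);
        }
    }
}
```

Keeping the copy step behind a small interface makes the local and remote cases easy to test separately, which is one reason downloading may be less fragile than teaching the class loader to read from HDFS directly.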
[jira] Commented: (HIVE-1157) UDFs can't be loaded via add jar when jar is on HDFS
[ https://issues.apache.org/jira/browse/HIVE-1157?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12915469#action_12915469 ]

Namit Jain commented on HIVE-1157:
----------------------------------

This is good to have - I will take a look
[jira] Commented: (HIVE-1157) UDFs can't be loaded via add jar when jar is on HDFS
[ https://issues.apache.org/jira/browse/HIVE-1157?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12915619#action_12915619 ]

Namit Jain commented on HIVE-1157:
----------------------------------

The changes looked good, but I got the following error:

[junit] Begin query: alter1.q
[junit] diff -a -I file: -I pfile: -I hdfs: -I /tmp/ -I invalidscheme: -I lastUpdateTime -I lastAccessTime -I [Oo]wner -I CreateTime -I LastAccessTime -I Location -I transient_lastDdlTime -I last_modified_ -I java.lang.RuntimeException -I at org -I at sun -I at java -I at junit -I Caused by: -I [.][.][.] [0-9]* more /data/users/njain/hive_commit2/hive_commit2/build/ql/test/logs/clientpositive/alter1.q.out /data/users/njain/hive_commit2/hive_commit2/ql/src/test/results/clientpositive/alter1.q.out
[junit] 778d777
[junit] < Resource ../data/files/TestSerDe.jar already added.

Philip, can you take care of that ?
[jira] Commented: (HIVE-1157) UDFs can't be loaded via add jar when jar is on HDFS
[ https://issues.apache.org/jira/browse/HIVE-1157?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12902208#action_12902208 ]

Carl Steinbach commented on HIVE-1157:
--------------------------------------

Hi Philip, please rebase the patch and I will take a look. Thanks.
Re: [jira] Commented: (HIVE-1157) UDFs can't be loaded via add jar when jar is on HDFS
Philip:

hive> add jar hdfs://localhost/FooTest.jar;
Unable to validate hdfs://localhost/FooTest.jar
Exception: Call to localhost/127.0.0.1:8020 failed on connection exception: java.net.ConnectException: Connection refused

Do you know how the port (8020) is configured for the 'add jar' command?

Thanks

On Sat, Mar 27, 2010 at 9:04 PM, Philip Zeyliger (JIRA) j...@apache.org wrote:

[ https://issues.apache.org/jira/browse/HIVE-1157?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12850628#action_12850628 ]

Philip Zeyliger commented on HIVE-1157:
---------------------------------------

Edward,

I'm having trouble reproducing the error you're seeing.

{quote}
create temporary function geoip as 'com.jointhegrid.hive.udf.GenericUDFGeoIP';
hive> select geoip(theIp ,'COUNTRY_NAME', './GeoLiteCity.dat.gz' ) from ip ;
java.lang.ClassNotFoundException: com.jointhegrid.hive.udf.GenericUDFGeoIP
Continuing ...
{quote}

On my machine, if I create temporary function with a class name that doesn't exist, it fails. So it makes no sense to me that create temporary function is succeeding, but then it's immediately not finding it. Do you have any theories on what's going on? Can you try to run it with debug on?

Thanks!
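On the port question above: the URI in the failing command, hdfs://localhost/FooTest.jar, carries no port, so the 8020 in the exception most likely comes from a default applied by the HDFS client rather than from anything specific to add jar. The small sketch below only illustrates that fallback with the standard `java.net.URI` class; `HdfsUriPort` and `effectivePort` are hypothetical names, and the constant 8020 is an assumption taken from the error message in this thread. If the NameNode listens elsewhere, spelling the port out in the URI (matching the cluster's fs.default.name setting) would be the thing to try.

```java
import java.net.URI;

final class HdfsUriPort {

    // Assumption: the client falls back to 8020 when the URI has no explicit port,
    // which matches the "localhost/127.0.0.1:8020" seen in the exception above.
    static final int ASSUMED_DEFAULT_NAMENODE_PORT = 8020;

    /** Returns the port a client would contact for this URI under the assumption above. */
    static int effectivePort(URI uri) {
        int port = uri.getPort(); // java.net.URI reports -1 when no port is given
        return port == -1 ? ASSUMED_DEFAULT_NAMENODE_PORT : port;
    }

    public static void main(String[] args) {
        System.out.println(effectivePort(URI.create("hdfs://localhost/FooTest.jar")));
        System.out.println(effectivePort(URI.create("hdfs://localhost:9000/FooTest.jar")));
    }
}
```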
[jira] Commented: (HIVE-1157) UDFs can't be loaded via add jar when jar is on HDFS
[ https://issues.apache.org/jira/browse/HIVE-1157?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12850628#action_12850628 ]

Philip Zeyliger commented on HIVE-1157:
---------------------------------------

Edward,

I'm having trouble reproducing the error you're seeing.

{quote}
create temporary function geoip as 'com.jointhegrid.hive.udf.GenericUDFGeoIP';
hive> select geoip(theIp ,'COUNTRY_NAME', './GeoLiteCity.dat.gz' ) from ip ;
java.lang.ClassNotFoundException: com.jointhegrid.hive.udf.GenericUDFGeoIP
Continuing ...
{quote}

On my machine, if I create temporary function with a class name that doesn't exist, it fails. So it makes no sense to me that create temporary function is succeeding, but then it's immediately not finding it. Do you have any theories on what's going on? Can you try to run it with debug on?

Thanks!
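The behaviour Philip describes (registering a function fails when the class cannot be found) comes down to whether a given class loader can resolve the class name. A minimal sketch of such a check is below; it is not Hive's FunctionTask code, and `ClassCheck` and `isLoadable` are hypothetical names. In Java the answer depends on which loader is asked, so a check like this is only meaningful when run against the same loader that the query will later use.

```java
final class ClassCheck {

    /** Returns true when the given loader can resolve the class name, false otherwise. */
    static boolean isLoadable(String className, ClassLoader loader) {
        try {
            // initialize=false: only locate and load the class, do not run static initializers.
            Class.forName(className, false, loader);
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    public static void main(String[] args) {
        ClassLoader loader = ClassLoader.getSystemClassLoader();
        System.out.println(isLoadable("java.lang.String", loader));
        // Not on the classpath unless the UDF jar has been added to this loader.
        System.out.println(isLoadable("com.jointhegrid.hive.udf.GenericUDFGeoIP", loader));
    }
}
```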
[jira] Commented: (HIVE-1157) UDFs can't be loaded via add jar when jar is on HDFS
[ https://issues.apache.org/jira/browse/HIVE-1157?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12848790#action_12848790 ]

Edward Capriolo commented on HIVE-1157:
---------------------------------------

Philip,

1) You are generating your patch from the ql subdirectory. For the final commit you have to generate it from the build root.
2) I tried this on a cluster with a local job tracker and a local (pseudo-distributed) namenode and datanode. It did not run. I am attaching the output.
[jira] Commented: (HIVE-1157) UDFs can't be loaded via add jar when jar is on HDFS
[ https://issues.apache.org/jira/browse/HIVE-1157?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12848242#action_12848242 ]

Edward Capriolo commented on HIVE-1157:
---------------------------------------

Looking at this now. One quick thing: you generated the patch from ql, not from the trunk, so you need to regenerate it.
[jira] Commented: (HIVE-1157) UDFs can't be loaded via add jar when jar is on HDFS
[ https://issues.apache.org/jira/browse/HIVE-1157?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12848261#action_12848261 ]

Edward Capriolo commented on HIVE-1157:
---------------------------------------

{noformat}
[edw...@ec hive]$ ant -Dtestcase=TestAddJarFromHDFS test

<testcase classname="org.apache.hadoop.hive.ql.session.TestAddJarFromHDFS" name="testAddJarFromHDFS" time="6.73">
<error type="java.lang.NullPointerException">java.lang.NullPointerException
    at org.apache.hadoop.hive.ql.session.SessionState$JarResourceHook.preHook(SessionState.java:391)
    at org.apache.hadoop.hive.ql.session.SessionState.add_resource(SessionState.java:474)
    at org.apache.hadoop.hive.ql.processors.AddResourceProcessor.run(AddResourceProcessor.java:52)
    at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:173)
    at org.apache.hadoop.hive.ql.session.TestAddJarFromHDFS.testAddJarFromHDFS(TestAddJarFromHDFS.java:71)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at junit.framework.TestCase.runTest(TestCase.java:154)
    at junit.framework.TestCase.runBare(TestCase.java:127)
    at junit.framework.TestResult$1.protect(TestResult.java:106)
    at junit.framework.TestResult.runProtected(TestResult.java:124)
    at junit.framework.TestResult.run(TestResult.java:109)
    at junit.framework.TestCase.run(TestCase.java:118)
    at junit.framework.TestSuite.runTest(TestSuite.java:208)
    at junit.framework.TestSuite.run(TestSuite.java:203)
    at org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.run(JUnitTestRunner.java:422)
    at org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.launch(JUnitTestRunner.java:931)
    at org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.main(JUnitTestRunner.java:785)
</error>
</testcase>
<system-out><![CDATA[Starting DataNode 0 with dfs.data.dir: build/test/data/dfs/data/data1,build/test/data/dfs/data/data2
]]></system-out>
<system-err><![CDATA[Waiting for the Mini HDFS Cluster to start...
]]></system-err>
</testsuite>
{noformat}

{noformat}
String jarURI = fs.getUri().toString() + "/addJarFromHdfs.jar";
int ret = cliDriver.processCmd("ADD JAR " + jarURI);
{noformat}

This is a clean checkout and build. I will try to trace this down more.
[jira] Commented: (HIVE-1157) UDFs can't be loaded via add jar when jar is on HDFS
[ https://issues.apache.org/jira/browse/HIVE-1157?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12841987#action_12841987 ]

Philip Zeyliger commented on HIVE-1157:
---------------------------------------

Has anyone had a chance to look at this? Would appreciate the feedback!

Thanks!
[jira] Commented: (HIVE-1157) UDFs can't be loaded via add jar when jar is on HDFS
[ https://issues.apache.org/jira/browse/HIVE-1157?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12842013#action_12842013 ]

Edward Capriolo commented on HIVE-1157:
---------------------------------------

Philip,

I will apply and test the code tonight. In the meantime, I do not see a unit test .q file. Since the test target happens after the build target, you can possibly bundle up CubeSampleUDF into a jar file and put your test code in a .q file. You can take a look at:

* contrib/src/test/queries/clientpositive/udf_example_add.q
* contrib/src/test/queries/clientpositive/dboutput.q

These both show how to test a UDF that is not hard-coded into the FunctionRegistry.
[jira] Commented: (HIVE-1157) UDFs can't be loaded via add jar when jar is on HDFS
[ https://issues.apache.org/jira/browse/HIVE-1157?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12840332#action_12840332 ]

Edward Capriolo commented on HIVE-1157:
---------------------------------------

Very cool. I will take a look at this. Even though the Hive test cases are abstracted, I believe you can use the DFS/MiniMRCluster, since the Hive class path inherits the Hadoop one.

If I understand your problem, you might be able to do this:

{noformat}
dfs -put your.jar /wherever
add jar hdfs://localhost:8020/wherever/your.jar;
{noformat}
[jira] Commented: (HIVE-1157) UDFs can't be loaded via add jar when jar is on HDFS
[ https://issues.apache.org/jira/browse/HIVE-1157?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12832554#action_12832554 ] Edward Capriolo commented on HIVE-1157: --- Sorry about that. I am not sure if I had an incomplete thought, or I cut half my message. In any case, I like your idea of bringing jars into HDFS. The fact that the jar file has to live on the local filesystem where the job is launched from is very constraining. You can not leverage your Distributed File System. The same can be said for SerDe's. Maybe these files should live on HDFS in the warehouse directory somehow. +1 on your thinking
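For readers following the thread, the approach proposed in the issue description (have add jar recognize that the target is not on the local filesystem, copy it locally, and load it into the classpath) can be sketched roughly as below. This is only an illustration under stated assumptions, not the code from any of the attached patches or from HIVE-1624: the class `AddJarSketch`, the `RemoteOpener` interface, and the methods `isRemote`, `localize`, and `addJar` are all hypothetical names, and a real implementation would presumably open the remote file through the Hadoop FileSystem API rather than a caller-supplied hook.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.URI;
import java.net.URL;
import java.net.URLClassLoader;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardCopyOption;

public class AddJarSketch {

    /**
     * Hypothetical hook for opening a non-local URI. It stands in for
     * whatever filesystem client the real code would use (assumption).
     */
    public interface RemoteOpener {
        InputStream open(URI uri) throws IOException;
    }

    /** True when the URI has a scheme other than "file", e.g. hdfs://... */
    public static boolean isRemote(URI uri) {
        String scheme = uri.getScheme();
        return scheme != null && !scheme.equalsIgnoreCase("file");
    }

    /**
     * Returns a local path for the jar. Local jars are returned as-is;
     * remote jars are first copied into a temporary file.
     */
    public static Path localize(URI jar, RemoteOpener opener) throws IOException {
        if (!isRemote(jar)) {
            // A bare path has no scheme; a file: URI can be converted directly.
            return jar.getScheme() == null ? Paths.get(jar.getPath()) : Paths.get(jar);
        }
        Path local = Files.createTempFile("added-jar-", ".jar");
        local.toFile().deleteOnExit();
        try (InputStream in = opener.open(jar)) {
            Files.copy(in, local, StandardCopyOption.REPLACE_EXISTING);
        }
        return local;
    }

    /** Builds a class loader that can see the (now local) jar. */
    public static URLClassLoader addJar(URI jar, RemoteOpener opener, ClassLoader parent)
            throws IOException {
        URL localUrl = localize(jar, opener).toUri().toURL();
        return new URLClassLoader(new URL[] { localUrl }, parent);
    }
}
```

The copy-then-load design keeps the class loader itself unaware of HDFS, which matches the preference stated in the thread for downloading the jar in add jar over teaching URLClassLoader to read remote files directly.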
[jira] Commented: (HIVE-1157) UDFs can't be loaded via add jar when jar is on HDFS
[ https://issues.apache.org/jira/browse/HIVE-1157?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12832381#action_12832381 ] Edward Capriolo commented on HIVE-1157: --- Removing local file dependencies is much cleaner.
[jira] Commented: (HIVE-1157) UDFs can't be loaded via add jar when jar is on HDFS
[ https://issues.apache.org/jira/browse/HIVE-1157?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12832389#action_12832389 ] Philip Zeyliger commented on HIVE-1157: --- Edward, I'm not sure what you mean. -- Philip