[jira] Commented: (HIVE-1692) FetchOperator.getInputFormatFromCache hides causal exception

2010-10-05 Thread Philip Zeyliger (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-1692?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12918422#action_12918422
 ] 

Philip Zeyliger commented on HIVE-1692:
---

BTW, to illustrate what a difference 3 characters make, compare debugging the 
following two errors:

(no patch)
{noformat}
10/10/05 21:55:39 ERROR CliDriver: Failed with exception java.io.IOException:java.io.IOException: Cannot create an instance of InputFormat class org.apache.hadoop.mapred.TextInputFormat as specified in mapredWork!
java.io.IOException: java.io.IOException: Cannot create an instance of InputFormat class org.apache.hadoop.mapred.TextInputFormat as specified in mapredWork!
    at org.apache.hadoop.hive.ql.exec.FetchOperator.getNextRow(FetchOperator.java:271)
    at org.apache.hadoop.hive.ql.exec.FetchTask.fetch(FetchTask.java:113)
    at org.apache.hadoop.hive.ql.Driver.getResults(Driver.java:657)
    at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:131)
    at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:181)
    at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:287)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at org.apache.hadoop.util.RunJar.main(RunJar.java:186)
Caused by: java.io.IOException: Cannot create an instance of InputFormat class org.apache.hadoop.mapred.TextInputFormat as specified in mapredWork!
    at org.apache.hadoop.hive.ql.exec.FetchOperator.getInputFormatFromCache(FetchOperator.java:113)
    at org.apache.hadoop.hive.ql.exec.FetchOperator.getRecordReader(FetchOperator.java:214)
    at org.apache.hadoop.hive.ql.exec.FetchOperator.getNextRow(FetchOperator.java:250)
    ... 10 more
{noformat}

(patch)
{noformat}
10/10/05 21:54:03 ERROR CliDriver: Failed with exception java.io.IOException:java.io.IOException: Cannot create an instance of InputFormat class org.apache.hadoop.mapred.TextInputFormat as specified in mapredWork!
java.io.IOException: java.io.IOException: Cannot create an instance of InputFormat class org.apache.hadoop.mapred.TextInputFormat as specified in mapredWork!
    at org.apache.hadoop.hive.ql.exec.FetchOperator.getNextRow(FetchOperator.java:271)
    at org.apache.hadoop.hive.ql.exec.FetchTask.fetch(FetchTask.java:113)
    at org.apache.hadoop.hive.ql.Driver.getResults(Driver.java:657)
    at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:131)
    at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:181)
    at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:287)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at org.apache.hadoop.util.RunJar.main(RunJar.java:186)
Caused by: java.io.IOException: Cannot create an instance of InputFormat class org.apache.hadoop.mapred.TextInputFormat as specified in mapredWork!
    at org.apache.hadoop.hive.ql.exec.FetchOperator.getInputFormatFromCache(FetchOperator.java:113)
    at org.apache.hadoop.hive.ql.exec.FetchOperator.getRecordReader(FetchOperator.java:214)
    at org.apache.hadoop.hive.ql.exec.FetchOperator.getNextRow(FetchOperator.java:250)
    ... 10 more
Caused by: java.lang.RuntimeException: Error in configuring object
    at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:93)
    at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:64)
    at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:117)
    at org.apache.hadoop.hive.ql.exec.FetchOperator.getInputFormatFromCache(FetchOperator.java:109)
    ... 12 more
Caused by: java.lang.reflect.InvocationTargetException
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:88)
    ... 15 more
Caused by: java.lang.IllegalArgumentException: Compression codec com.hadoop.compression.lzo.LzoCodec not found.
    at org.apache.hadoop.io.compress.CompressionCodecFactory.getCodecClasses(CompressionCodecFactory.java:96)
    at org.apache.hadoop.io.com
{noformat}

[jira] Updated: (HIVE-1692) FetchOperator.getInputFormatFromCache hides causal exception

2010-10-05 Thread Philip Zeyliger (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-1692?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Philip Zeyliger updated HIVE-1692:
--

Attachment: HIVE-1692.patch.txt

I'll spare folks downloading the attachment:

{noformat}
-+ inputFormatClass.getName() + " as specified in mapredWork!");
++ inputFormatClass.getName() + " as specified in mapredWork!", e);
{noformat}
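
For context, the surrounding method looks roughly like the sketch below (a paraphrase for illustration, not the exact Hive source; the "inputFormats" cache field name is assumed).  The entire fix is passing the caught exception "e" as the second IOException constructor argument, which preserves the cause chain that the "Caused by: ... LzoCodec not found" frames above come from:

{noformat}
// Paraphrase only -- not the actual FetchOperator source.  "inputFormats" is
// assumed to be the operator's static Class -> InputFormat cache.
private static InputFormat<WritableComparable, Writable> getInputFormatFromCache(
    Class inputFormatClass, JobConf job) throws IOException {
  if (!inputFormats.containsKey(inputFormatClass)) {
    try {
      InputFormat<WritableComparable, Writable> newInstance =
          (InputFormat<WritableComparable, Writable>) ReflectionUtils.newInstance(inputFormatClass, job);
      inputFormats.put(inputFormatClass, newInstance);
    } catch (Exception e) {
      // Without the trailing ", e" the root cause (here, the missing LzoCodec) is dropped.
      throw new IOException("Cannot create an instance of InputFormat class "
          + inputFormatClass.getName() + " as specified in mapredWork!", e);
    }
  }
  return inputFormats.get(inputFormatClass);
}
{noformat}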

> FetchOperator.getInputFormatFromCache hides causal exception
> 
>
> Key: HIVE-1692
> URL: https://issues.apache.org/jira/browse/HIVE-1692
> Project: Hadoop Hive
>  Issue Type: Improvement
>  Components: Query Processor
>Affects Versions: 0.7.0
>Reporter: Philip Zeyliger
>Priority: Minor
> Fix For: 0.7.0
>
> Attachments: HIVE-1692.patch.txt
>
>
> There's a line in FetchOperator.getInputFormatFromCache that catches all 
> exceptions and re-throws IOException instead, hiding the original cause.  I 
> ran into this, naturally, and wish to fix it.  Patch below is trivial.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Created: (HIVE-1692) FetchOperator.getInputFormatFromCache hides causal exception

2010-10-05 Thread Philip Zeyliger (JIRA)
FetchOperator.getInputFormatFromCache hides causal exception


 Key: HIVE-1692
 URL: https://issues.apache.org/jira/browse/HIVE-1692
 Project: Hadoop Hive
  Issue Type: Improvement
  Components: Query Processor
Affects Versions: 0.7.0
Reporter: Philip Zeyliger
Priority: Minor
 Fix For: 0.7.0


There's a line in FetchOperator.getInputFormatFromCache that catches all 
exceptions and re-throws IOException instead, hiding the original cause.  I ran 
into this, naturally, and wish to fix it.  Patch below is trivial.




[jira] Updated: (HIVE-1157) UDFs can't be loaded via "add jar" when jar is on HDFS

2010-10-01 Thread Philip Zeyliger (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-1157?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Philip Zeyliger updated HIVE-1157:
--

Attachment: HIVE-1157.patch.v6.txt

> UDFs can't be loaded via "add jar" when jar is on HDFS
> --
>
> Key: HIVE-1157
> URL: https://issues.apache.org/jira/browse/HIVE-1157
> Project: Hadoop Hive
>  Issue Type: Improvement
>  Components: Query Processor
>Reporter: Philip Zeyliger
>Priority: Minor
> Attachments: hive-1157.patch.txt, HIVE-1157.patch.v3.txt, 
> HIVE-1157.patch.v4.txt, HIVE-1157.patch.v5.txt, HIVE-1157.patch.v6.txt, 
> HIVE-1157.v2.patch.txt, output.txt
>
>
> As discussed on the mailing list, it would be nice if you could use UDFs that 
> are on jars on HDFS.  The proposed implementation would be for "add jar" to 
> recognize that the target file is on HDFS, copy it locally, and load it into 
> the classpath.
> {quote}
> Hi folks,
> I have a quick question about UDF support in Hive.  I'm on the 0.5 branch.  
> Can you use a UDF where the jar which contains the function is on HDFS, and 
> not on the local filesystem.  Specifically, the following does not seem to 
> work:
> # This is Hive 0.5, from svn
> $bin/hive  
> Hive history file=/tmp/philip/hive_job_log_philip_201002081541_370227273.txt
> hive> add jar hdfs://localhost/FooTest.jar;   
>
> Added hdfs://localhost/FooTest.jar to class path
> hive> create temporary function cube as 'com.cloudera.FooTestUDF';
> 
> FAILED: Execution Error, return code 1 from 
> org.apache.hadoop.hive.ql.exec.FunctionTask
> Does this work for other people?  I could probably fix it by changing "add 
> jar" to download remote jars locally, when necessary (to load them into the 
> classpath), or update URLClassLoader (or whatever is underneath there) to 
> read directly from HDFS, which seems a bit more fragile.  But I wanted to 
> make sure that my interpretation of what's going on is right before I have at 
> it.
> Thanks,
> -- Philip
> {quote}
> {quote}
> Yes that's correct. I prefer to download the jars in "add jar".
> Zheng
> {quote}




[jira] Commented: (HIVE-1157) UDFs can't be loaded via "add jar" when jar is on HDFS

2010-10-01 Thread Philip Zeyliger (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-1157?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12916956#action_12916956
 ] 

Philip Zeyliger commented on HIVE-1157:
---

Namit,

Thanks for the review.  I've fixed the test failures.  The one you pointed out 
was a missing log line from the results.  And there was a second one having to 
do with relative paths.

Oddly enough, however, when I tried to bring the changes up to current trunk, 
it turned out that HIVE-1624 conflicted, and, when I looked at it, it supplies 
the same feature as this patch.  I'll upload the fixed patch for posterity, but 
it looks like this issue is no longer necessary.

-- Philip

> UDFs can't be loaded via "add jar" when jar is on HDFS
> --
>
> Key: HIVE-1157
> URL: https://issues.apache.org/jira/browse/HIVE-1157
> Project: Hadoop Hive
>  Issue Type: Improvement
>  Components: Query Processor
>Reporter: Philip Zeyliger
>Priority: Minor
> Attachments: hive-1157.patch.txt, HIVE-1157.patch.v3.txt, 
> HIVE-1157.patch.v4.txt, HIVE-1157.patch.v5.txt, HIVE-1157.v2.patch.txt, 
> output.txt
>
>




[jira] Updated: (HIVE-1157) UDFs can't be loaded via "add jar" when jar is on HDFS

2010-09-27 Thread Philip Zeyliger (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-1157?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Philip Zeyliger updated HIVE-1157:
--

Attachment: HIVE-1157.patch.v4.txt

Carl,

I updated the patch to current trunk.

> UDFs can't be loaded via "add jar" when jar is on HDFS
> --
>
> Key: HIVE-1157
> URL: https://issues.apache.org/jira/browse/HIVE-1157
> Project: Hadoop Hive
>  Issue Type: Improvement
>  Components: Query Processor
>Reporter: Philip Zeyliger
>Priority: Minor
> Attachments: hive-1157.patch.txt, HIVE-1157.patch.v3.txt, 
> HIVE-1157.patch.v4.txt, HIVE-1157.v2.patch.txt, output.txt
>
>




[jira] Commented: (HIVE-1530) Include hive-default.xml and hive-log4j.properties in hive-common JAR

2010-09-24 Thread Philip Zeyliger (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-1530?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12914528#action_12914528
 ] 

Philip Zeyliger commented on HIVE-1530:
---

+1.  I'm a big fan of this change.

We've repeatedly had customers using an old, weird, or non-existent 
hive-default, and that's caused issues that are quite tricky to debug.

> Include hive-default.xml and hive-log4j.properties in hive-common JAR
> -
>
> Key: HIVE-1530
> URL: https://issues.apache.org/jira/browse/HIVE-1530
> Project: Hadoop Hive
>  Issue Type: Improvement
>  Components: Configuration
>Reporter: Carl Steinbach
>Assignee: Carl Steinbach
> Fix For: 0.7.0
>
> Attachments: HIVE-1530.1.patch.txt
>
>
> hive-common-*.jar should include hive-default.xml and hive-log4j.properties,
> and similarly hive-exec-*.jar should include hive-exec-log4j.properties. The
> hive-default.xml file that currently sits in the conf/ directory should be 
> removed.
> Motivations for this change:
> * We explicitly tell users that they should never modify hive-default.xml yet 
> give them the opportunity to do so by placing the file in the conf dir.
> * Many users are familiar with the Hadoop configuration mechanism that does 
> not require *-default.xml files to be present in the HADOOP_CONF_DIR, and 
> assume that the same is true for HIVE_CONF_DIR.
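
As an aside, the classpath-based lookup the description relies on can be shown with a few lines of plain Hadoop Configuration code.  This is an illustrative sketch, not HiveConf's actual implementation; it only assumes hive-default.xml ends up packaged in a jar on the classpath:

{noformat}
import java.net.URL;
import org.apache.hadoop.conf.Configuration;

// Sketch: resolve hive-default.xml through the classloader (i.e. from
// hive-common-*.jar) instead of from a file in HIVE_CONF_DIR.
public class DefaultsFromClasspath {
  public static void main(String[] args) {
    URL defaults = DefaultsFromClasspath.class.getClassLoader().getResource("hive-default.xml");
    Configuration conf = new Configuration(false);
    if (defaults != null) {
      conf.addResource(defaults);      // defaults shipped inside the jar
    }
    conf.addResource("hive-site.xml"); // site overrides, found on the classpath as before
    System.out.println("hive-default.xml resolved from: " + defaults);
  }
}
{noformat}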




[jira] Commented: (HIVE-1157) UDFs can't be loaded via "add jar" when jar is on HDFS

2010-03-27 Thread Philip Zeyliger (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-1157?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12850628#action_12850628
 ] 

Philip Zeyliger commented on HIVE-1157:
---

Edward,

I'm having trouble reproducing the error you're seeing.

{quote}

create temporary function geoip as 'com.jointhegrid.hive.udf.GenericUDFGeoIP';

hive> select geoip(theIp ,'COUNTRY_NAME', './GeoLiteCity.dat.gz' ) from ip ; 
java.lang.ClassNotFoundException: com.jointhegrid.hive.udf.GenericUDFGeoIP
Continuing ...
{quote}

On my machine, if I create a temporary function with a class name that doesn't 
exist, it fails.  So it makes no sense to me that "create temporary function" 
is succeeding but the class then immediately can't be found.  Do you have any 
theories on what's going on?  Can you try to run it with debug on?

Thanks!

> UDFs can't be loaded via "add jar" when jar is on HDFS
> --
>
> Key: HIVE-1157
> URL: https://issues.apache.org/jira/browse/HIVE-1157
> Project: Hadoop Hive
>  Issue Type: Improvement
>  Components: Query Processor
>Reporter: Philip Zeyliger
>Priority: Minor
> Attachments: hive-1157.patch.txt, HIVE-1157.patch.v3.txt, 
> HIVE-1157.v2.patch.txt, output.txt
>
>




[jira] Updated: (HIVE-1157) UDFs can't be loaded via "add jar" when jar is on HDFS

2010-03-22 Thread Philip Zeyliger (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-1157?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Philip Zeyliger updated HIVE-1157:
--

Attachment: HIVE-1157.patch.v3.txt

Ed,

Indeed, I've been able to reproduce that.  I traced it down to some bad error 
handling when scratch_dir doesn't exist.  The new patch creates a scratch dir 
if it doesn't already exist, and adds an if/else to make sure 
localFile.delete() isn't called if localFile is null.

Sorry about that.  I'm not sure whether something on trunk changed how the 
scratch dir works between when I created the patch and now, or whether the 
scratch dir had been created by other tests in my local checkout.  Either way, 
this should fix it.

Thanks!
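
For the curious, the two guards described above amount to something like this sketch (class and field names are hypothetical, not the actual SessionState members in the attached patch):

{noformat}
import java.io.File;
import java.io.IOException;

// Hypothetical sketch of the two guards -- not the attached patch itself.
public class LocalJarCache {
  private final File scratchDir;
  private File localFile;  // stays null until a remote jar is actually copied down

  public LocalJarCache(String scratchPath) throws IOException {
    scratchDir = new File(scratchPath);
    // Guard 1: create the scratch dir instead of assuming it already exists.
    if (!scratchDir.exists() && !scratchDir.mkdirs()) {
      throw new IOException("could not create scratch dir " + scratchDir);
    }
  }

  public void cleanup() {
    // Guard 2: only delete a local copy that was actually made.
    if (localFile != null) {
      localFile.delete();
      localFile = null;
    }
  }
}
{noformat}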

> UDFs can't be loaded via "add jar" when jar is on HDFS
> --
>
> Key: HIVE-1157
> URL: https://issues.apache.org/jira/browse/HIVE-1157
> Project: Hadoop Hive
>  Issue Type: Improvement
>  Components: Query Processor
>Reporter: Philip Zeyliger
>Priority: Minor
> Attachments: hive-1157.patch.txt, HIVE-1157.patch.v3.txt, 
> HIVE-1157.v2.patch.txt
>
>




[jira] Commented: (HIVE-1157) UDFs can't be loaded via "add jar" when jar is on HDFS

2010-03-22 Thread Philip Zeyliger (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-1157?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12848218#action_12848218
 ] 

Philip Zeyliger commented on HIVE-1157:
---

Anyone care to take a look?

> UDFs can't be loaded via "add jar" when jar is on HDFS
> --
>
> Key: HIVE-1157
> URL: https://issues.apache.org/jira/browse/HIVE-1157
> Project: Hadoop Hive
>  Issue Type: Improvement
>  Components: Query Processor
>Reporter: Philip Zeyliger
>Priority: Minor
> Attachments: hive-1157.patch.txt, HIVE-1157.v2.patch.txt
>
>




[jira] Updated: (HIVE-1157) UDFs can't be loaded via "add jar" when jar is on HDFS

2010-03-16 Thread Philip Zeyliger (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-1157?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Philip Zeyliger updated HIVE-1157:
--

Attachment: HIVE-1157.v2.patch.txt

I've uploaded a new patch with a bug fix (wasn't unregistering the jars 
correctly) and with a test.

The test starts a MiniDFSCluster and runs add jar and delete jar explicitly, 
without using the ".q" framework.  I felt this was the best way to test just 
the new behavior.
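
To spare readers another attachment download, the rough shape of such a test is below.  This is a sketch, not the attached test: the local jar path is a placeholder and the call into the session's "add jar"/"delete jar" handling is left as a comment, since only the MiniDFSCluster plumbing here is standard Hadoop API:

{noformat}
import junit.framework.TestCase;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.hdfs.MiniDFSCluster;

public class TestAddJarFromHdfs extends TestCase {
  public void testAddJarFromHdfs() throws Exception {
    Configuration conf = new Configuration();
    MiniDFSCluster cluster = new MiniDFSCluster(conf, 1, true, null);
    try {
      FileSystem fs = cluster.getFileSystem();
      Path jarOnDfs = new Path("/tmp/FooTest.jar");
      fs.copyFromLocalFile(new Path("build/FooTest.jar"), jarOnDfs);  // placeholder local jar

      String hdfsUri = fs.makeQualified(jarOnDfs).toString();
      // The interesting part: hand the hdfs:// URI to the "add jar" code path
      // (however the session exposes it), assert the resource is registered,
      // then run "delete jar" and assert it is gone.
      assertTrue(hdfsUri.startsWith("hdfs://"));
    } finally {
      cluster.shutdown();
    }
  }
}
{noformat}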

> UDFs can't be loaded via "add jar" when jar is on HDFS
> --
>
> Key: HIVE-1157
> URL: https://issues.apache.org/jira/browse/HIVE-1157
> Project: Hadoop Hive
>  Issue Type: Improvement
>  Components: Query Processor
>Reporter: Philip Zeyliger
>Priority: Minor
> Attachments: hive-1157.patch.txt, HIVE-1157.v2.patch.txt
>
>




[jira] Commented: (HIVE-1157) UDFs can't be loaded via "add jar" when jar is on HDFS

2010-03-05 Thread Philip Zeyliger (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-1157?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12841987#action_12841987
 ] 

Philip Zeyliger commented on HIVE-1157:
---

Has anyone had a chance to look at this?  Would appreciate the feedback!

Thanks!

> UDFs can't be loaded via "add jar" when jar is on HDFS
> --
>
> Key: HIVE-1157
> URL: https://issues.apache.org/jira/browse/HIVE-1157
> Project: Hadoop Hive
>  Issue Type: Improvement
>  Components: Query Processor
>Reporter: Philip Zeyliger
>Priority: Minor
> Attachments: hive-1157.patch.txt
>
>




[jira] Updated: (HIVE-1157) UDFs can't be loaded via "add jar" when jar is on HDFS

2010-03-02 Thread Philip Zeyliger (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-1157?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Philip Zeyliger updated HIVE-1157:
--

Attachment: hive-1157.patch.txt

This patch changes SessionState.java to copy jar resources locally, if they're 
not local already.

Because I had to manage additional per-resource state (namely, the location of 
the local copy, so that it can be cleaned up), I modified the ResourceType enum 
to be simply an enum, and now there is one ResourceHook object per resource, 
not per resource type.  I changed the container map to be an EnumMap.

It turns out that you can't specify an HDFS path to "-libjars", so I had to 
also modify ExecDriver.java to call a special method when it's getting jar 
resources.
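
Sketched very roughly, the localization step is the following.  This is a hypothetical helper for illustration only; the actual patch does this inside SessionState and ExecDriver, and the names here are made up:

{noformat}
import java.io.File;
import java.net.URL;
import java.net.URLClassLoader;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

// Illustrative sketch of the "copy remote jars locally" idea.
public class RemoteJarHelper {
  /** If value is a non-local URI (e.g. hdfs://...), copy it into scratchDir
   *  and return the local path; otherwise return value unchanged. */
  public static String downloadResourceIfNeeded(Configuration conf, String value,
                                                File scratchDir) throws Exception {
    Path remote = new Path(value);
    String scheme = remote.toUri().getScheme();
    if (scheme == null || "file".equals(scheme)) {
      return value;  // already local
    }
    FileSystem fs = remote.getFileSystem(conf);
    File local = new File(scratchDir, remote.getName());
    fs.copyToLocalFile(remote, new Path(local.getAbsolutePath()));
    return local.getAbsolutePath();
  }

  /** Add a local jar to a URLClassLoader-based classpath. */
  public static ClassLoader addToClassPath(ClassLoader parent, String localJar) throws Exception {
    return new URLClassLoader(new URL[] { new File(localJar).toURI().toURL() }, parent);
  }
}
{noformat}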

I would appreciate some guidance on how to test this best.  So far, I've 
manually done the following steps:
{noformat}
create table t (x int);
# Create a file with "1\n2\n3\n" as /tmp/a.
load data local inpath '/tmp/a' into table t;
add jar hdfs://localhost:8020/Test.jar;
create temporary function cube as 'org.apache.hive.test.CubeSampleUDF';  # I 
wrote this
select cube(x) from t;
{noformat}
What else would it be reasonable for me to do?  It looks like there's no DFS in 
the test environment.  I might be able to register an ad-hoc file system 
implementation of some sort or use mockito or some such...  What do you 
recommend?

I'm running the existing tests to make sure that I haven't broken anything.  
These seem to take a while, so I'll report back.

> UDFs can't be loaded via "add jar" when jar is on HDFS
> --
>
> Key: HIVE-1157
> URL: https://issues.apache.org/jira/browse/HIVE-1157
> Project: Hadoop Hive
>  Issue Type: Improvement
>  Components: Query Processor
>Reporter: Philip Zeyliger
>Priority: Minor
> Attachments: hive-1157.patch.txt
>
>




[jira] Commented: (HIVE-1157) UDFs can't be loaded via "add jar" when jar is on HDFS

2010-02-10 Thread Philip Zeyliger (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-1157?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12832389#action_12832389
 ] 

Philip Zeyliger commented on HIVE-1157:
---

Edward,

I'm not sure what you mean.

-- Philip

> UDFs can't be loaded via "add jar" when jar is on HDFS
> --
>
> Key: HIVE-1157
> URL: https://issues.apache.org/jira/browse/HIVE-1157
> Project: Hadoop Hive
>  Issue Type: Improvement
>  Components: Query Processor
>Reporter: Philip Zeyliger
>Priority: Minor
>




[jira] Created: (HIVE-1157) UDFs can't be loaded via "add jar" when jar is on HDFS

2010-02-10 Thread Philip Zeyliger (JIRA)
UDFs can't be loaded via "add jar" when jar is on HDFS
--

 Key: HIVE-1157
 URL: https://issues.apache.org/jira/browse/HIVE-1157
 Project: Hadoop Hive
  Issue Type: Improvement
  Components: Query Processor
Reporter: Philip Zeyliger
Priority: Minor


As discussed on the mailing list, it would be nice if you could use UDFs that 
are on jars on HDFS.  The proposed implementation would be for "add jar" to 
recognize that the target file is on HDFS, copy it locally, and load it into 
the classpath.

{quote}
Hi folks,

I have a quick question about UDF support in Hive.  I'm on the 0.5 branch.  Can 
you use a UDF where the jar which contains the function is on HDFS, and not on 
the local filesystem.  Specifically, the following does not seem to work:

# This is Hive 0.5, from svn
$bin/hive  
Hive history file=/tmp/philip/hive_job_log_philip_201002081541_370227273.txt
hive> add jar hdfs://localhost/FooTest.jar; 
 
Added hdfs://localhost/FooTest.jar to class path
hive> create temporary function cube as 'com.cloudera.FooTestUDF';  
  
FAILED: Execution Error, return code 1 from 
org.apache.hadoop.hive.ql.exec.FunctionTask

Does this work for other people?  I could probably fix it by changing "add jar" 
to download remote jars locally, when necessary (to load them into the 
classpath), or update URLClassLoader (or whatever is underneath there) to read 
directly from HDFS, which seems a bit more fragile.  But I wanted to make sure 
that my interpretation of what's going on is right before I have at it.

Thanks,

-- Philip
{quote}

{quote}
Yes that's correct. I prefer to download the jars in "add jar".

Zheng
{quote}




[jira] Commented: (HIVE-802) Bug in DataNucleus prevents Hive from building if inside a dir with '+' in it

2010-02-02 Thread Philip Zeyliger (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-802?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12828844#action_12828844
 ] 

Philip Zeyliger commented on HIVE-802:
--

If we uploaded a patched DataNucleus to solve this issue, would folks be all 
right with including it in Hive (and possibly the 0.5 branch)?

I ran into this again recently, trying to run Hive's 0.5 metastore server 
against the current version of Cloudera's distribution, and it took a while to 
decipher the error.

Thanks,

-- Philip

> Bug in DataNucleus prevents Hive from building if inside a dir with '+' in it
> -
>
> Key: HIVE-802
> URL: https://issues.apache.org/jira/browse/HIVE-802
> Project: Hadoop Hive
>  Issue Type: Bug
>  Components: Build Infrastructure
>Reporter: Todd Lipcon
>
> There's a bug in DataNucleus that causes this issue:
> http://www.jpox.org/servlet/jira/browse/NUCCORE-371
> To reproduce, simply put your hive source tree in a directory that contains a 
> '+' character.
