RE: Hive query failing

2012-09-25 Thread Manish.Bhoge
Sarath,

Is this an external table that you ran the query against? How did you load the
table? Because it looks like the error is about a file related to the table rather
than the CDH JAR.

Thank You,
Manish

From: Sarath [mailto:sarathchandra.jos...@algofusiontech.com]
Sent: Tuesday, September 25, 2012 3:48 PM
To: user@hive.apache.org
Subject: Hive query failing

Hi,

When I run the query select count(1) from table1; it fails with the exception below.

java.io.FileNotFoundException: File does not exist: /usr/lib/hive/lib/hive-builtins-0.8.1-cdh4.0.1.jar
    at org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:736)
    at org.apache.hadoop.mapreduce.filecache.ClientDistributedCacheManager.getFileStatus(ClientDistributedCacheManager.java:208)
    at org.apache.hadoop.mapreduce.filecache.ClientDistributedCacheManager.determineTimestamps(ClientDistributedCacheManager.java:71)
    at org.apache.hadoop.mapreduce.JobSubmitter.copyAndConfigureFiles(JobSubmitter.java:245)
    at org.apache.hadoop.mapreduce.JobSubmitter.copyAndConfigureFiles(JobSubmitter.java:283)
    at org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:354)
    at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1226)
    at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1223)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:396)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1232)
    at org.apache.hadoop.mapreduce.Job.submit(Job.java:1223)
    at org.apache.hadoop.mapred.JobClient$1.run(JobClient.java:609)
    at org.apache.hadoop.mapred.JobClient$1.run(JobClient.java:604)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:396)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1232)
    at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:604)
    at org.apache.hadoop.hive.ql.exec.ExecDriver.execute(ExecDriver.java:452)
    at org.apache.hadoop.hive.ql.exec.ExecDriver.main(ExecDriver.java:710)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at org.apache.hadoop.util.RunJar.main(RunJar.java:208)
Job Submission failed with exception 'java.io.FileNotFoundException(File does not exist: /usr/lib/hive/lib/hive-builtins-0.8.1-cdh4.0.1.jar)'
Execution failed with exit status: 2

But the JAR mentioned in the message exists and appropriate R/W permissions 
have been set on the folder /usr/lib/hive for the user.
What is going wrong?

Regards,
Sarath.


RE: Custom MR scripts using java in Hive

2012-09-25 Thread Manish.Bhoge
Manu,

If you have written a UDF in Java for Hive, you need to copy your JAR into the
/usr/lib/hive/lib/ folder on your Hadoop cluster so that Hive can use it.
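
For illustration, a minimal Java UDF of the kind described above might look like the
sketch below; the class name and behaviour are made up for this example, not taken
from the thread.

import org.apache.hadoop.hive.ql.exec.UDF;
import org.apache.hadoop.io.Text;

// Hypothetical example: a trivial UDF that lower-cases a string column.
public final class LowerCaseUDF extends UDF {
  public Text evaluate(final Text input) {
    if (input == null) {
      return null;
    }
    return new Text(input.toString().toLowerCase());
  }
}

Once packaged into a JAR and copied onto the cluster, such a function would typically
be registered from the Hive CLI with ADD JAR and CREATE TEMPORARY FUNCTION before
being used in a query.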

Thank You,
Manish

From: Manu A [mailto:hadoophi...@gmail.com]
Sent: Tuesday, September 25, 2012 3:44 PM
To: user@hive.apache.org
Subject: Custom MR scripts using java in Hive

Hi All,
I am learning Hive. Please let me know if anyone has tried custom MapReduce scripts
using Java in Hive, or point me to links and blogs with an example.

When I tried, I got the below error:

Hadoop job information for Stage-1: number of mappers: 1; number of reducers: 0
2012-09-25 02:47:23,720 Stage-1 map = 0%,  reduce = 0%
2012-09-25 02:47:56,943 Stage-1 map = 100%,  reduce = 100%
Ended Job = job_20120931_0001 with errors
Error during job, obtaining debugging information...
Examining task ID: task_20120931_0001_m_02 (and more) from job job_20120931_0001
Exception in thread Thread-51 java.lang.RuntimeException: Error while reading from task log url
    at org.apache.hadoop.hive.ql.exec.errors.TaskLogProcessor.getErrors(TaskLogProcessor.java:130)
    at org.apache.hadoop.hive.ql.exec.JobDebugger.showJobFailDebugInfo(JobDebugger.java:211)
    at org.apache.hadoop.hive.ql.exec.JobDebugger.run(JobDebugger.java:81)
    at java.lang.Thread.run(Thread.java:619)
Caused by: java.io.IOException: Server returned HTTP response code: 400 for URL: // removed as confidential
    at sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1313)
    at java.net.URL.openStream(URL.java:1010)
    at org.apache.hadoop.hive.ql.exec.errors.TaskLogProcessor.getErrors(TaskLogProcessor.java:120)
    ... 3 more
FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.MapRedTask
MapReduce Jobs Launched:
Job 0: Map: 1   HDFS Read: 0 HDFS Write: 0 FAIL
Total MapReduce CPU Time Spent: 0 msec




Thanks for your help in advance :)

Thanks & Regards,
Manu

Re: Hive query failing

2012-09-25 Thread kulkarni.swarnim
As the exception suggests, the JAR is being looked up on HDFS. Run the following
commands:

$ hadoop fs -mkdir /usr/lib/hive/lib
$ hadoop fs -put $HIVE_HOME/lib/hive-builtins-0.8.1-cdh4.0.1.jar /usr/lib/hive/lib

Your queries should work now.



Re: Custom MR scripts using java in Hive

2012-09-25 Thread Tamil A
Hi Manish,

Thanks for your help. I did the same using a UDF. Now I am trying the TRANSFORM, MAP
and REDUCE clauses. So does that mean that with Java we go through a UDF, while for
other languages we use MapReduce scripts, i.e. the TRANSFORM, MAP and REDUCE clauses?
Please correct me if I am wrong.




Thanks & Regards,

Manu 



--
Thanks & Regards,
Tamil


Re: Custom MR scripts using java in Hive

2012-09-25 Thread Manu A
Thanks Manish. I'll try the same.



Thanks & Regards,

Manu 

 





Re: Custom MR scripts using java in Hive

2012-09-25 Thread Bertrand Dechoux
For Java, you can also consider reflection via the reflect UDF:
http://hive.apache.org/docs/r0.9.0/udf/reflect.html

Regards

Bertrand





-- 
Bertrand Dechoux


How connect to hive server without using jdbc

2012-09-25 Thread Abhishek
Hi all,

Is there any way to connect to the Hive server through an API?

Regards
Abhi



Sent from my iPhone


Re: How connect to hive server without using jdbc

2012-09-25 Thread Haijia Zhou
https://cwiki.apache.org/Hive/hiveclient.html#HiveClient-JDBC



Re: How connect to hive server without using jdbc

2012-09-25 Thread Abhishek
Hi Zhou,

But I am looking to connect to the Hive server without JDBC - some other way,
through an API.

Regards
Abhi

Sent from my iPhone


Re: How connect to hive server without using jdbc

2012-09-25 Thread Haijia Zhou
The page also contains information about using other APIs to connect to Hive.
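
For reference, a rough sketch of the Thrift-based HiveClient usage that page
describes, assuming a HiveServer1-style Thrift service; the host, port and query are
illustrative, not prescribed by this thread.

import java.util.List;
import org.apache.hadoop.hive.service.HiveClient;
import org.apache.thrift.protocol.TBinaryProtocol;
import org.apache.thrift.transport.TSocket;
import org.apache.thrift.transport.TTransport;

// Minimal Thrift client: connect, run a query, fetch the result rows.
public class HiveThriftExample {
  public static void main(String[] args) throws Exception {
    TTransport transport = new TSocket("localhost", 10000); // assumed host/port
    transport.open();
    HiveClient client = new HiveClient(new TBinaryProtocol(transport));
    client.execute("select count(1) from table1");
    List<String> rows = client.fetchAll(); // rows come back as delimited strings
    for (String row : rows) {
      System.out.println(row);
    }
    transport.close();
  }
}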





Re: How connect to hive server without using jdbc

2012-09-25 Thread Doug Houck
Also, this is all open source, right?  So you could take a look at the CLI code 
to see how it does it.




Re: How connect to hive server without using jdbc

2012-09-25 Thread Abhishek
Hi Zhou,

Thanks for the reply. We are shutting down the Thrift service due to security
issues with Hive.

Regards
Abhi

Sent from my iPhone



Re: How connect to hive server without using jdbc

2012-09-25 Thread Abhishek
Hi Doug,

Thanks for the reply. Can you point me to the CLI code?

Regards 
Abhi

Sent from my iPhone



Hive File Sizes, Merging, and Splits

2012-09-25 Thread John Omernik
I am really struggling to make heads or tails of how to optimize the data in my
tables for the best query times. I have a partition of Gzip-compressed RCFile data
in two files:

total 421877
263715 -rwxr-xr-x 1 darkness darkness 270044140 2012-09-25 13:32 00_0
158162 -rwxr-xr-x 1 darkness darkness 161956948 2012-09-25 13:32 01_0



No matter what I set my split settings to prior to the job, I always get three
mappers. My block size is 268435456, but the setting doesn't seem to change
anything. I can set the split size huge or small with no apparent effect on the
data.


I know there are many esoteric items here, but is there any good documentation on
setting these things to make my queries on this data more efficient? I am not sure
why it needs three map tasks on this data; it should really just grab two mappers.
Not to mention, I thought gzip wasn't splittable anyhow, so from that standpoint,
how does it even send data to three mappers? If you know of some secret cache of
documentation for Hive, I'd love to read it.

Thanks


How can I get the constant value from the ObjectInspector in the UDF

2012-09-25 Thread java8964 java8964

Hi, I am using the Cloudera release cdh3u3, which has Hive 0.7.1.
I am trying to write a Hive UDF to calculate a moving sum. Right now, I am having
trouble getting the constant value that is passed in at the initialization stage.
For example, let's assume the function has the following format:

msum(salary, 10) - salary is an int type column

which means the end user wants to calculate the moving sum over the last 10 rows of
salary.

I roughly know how to implement this UDF, but I have one problem right now:
1) This is not a UDAF, as each row returns one value back as the moving sum.
2) I create a UDF class that extends GenericUDF.
3) I can get the column types from the ObjectInspector[] passed to me in the
initialize() method, to verify that 'salary' and 10 are both numeric (the latter
needs to be an integer).
4) But I also want to get the real value of 10 at the initialize() stage, so I can
create the corresponding data structure based on the value the end user specified.
5) I looked around the javadoc of the ObjectInspector class. I know that at run time
the real class of the 2nd parameter is WritableIntObjectInspector. I can get the
type, but how can I get the real value?
6) This is a kind of ConstantObjectInspector, so it should be able to give the value
to me, as it already knows the type is int. But how?
7) I don't want to get the value at the evaluate stage. Can I get it at the
initialize stage?

Thanks
Yong
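
A rough sketch of one way to read the constant at initialize() time - assuming a
Hive version in which the inspector for a literal argument implements
ConstantObjectInspector (on older releases the concrete
WritableConstantIntObjectInspector plays the same role); the class below is
hypothetical and only the handling of the constant argument is shown.

import org.apache.hadoop.hive.ql.exec.UDFArgumentException;
import org.apache.hadoop.hive.ql.metadata.HiveException;
import org.apache.hadoop.hive.ql.udf.generic.GenericUDF;
import org.apache.hadoop.hive.serde2.objectinspector.ConstantObjectInspector;
import org.apache.hadoop.hive.serde2.objectinspector.ObjectInspector;
import org.apache.hadoop.hive.serde2.objectinspector.primitive.PrimitiveObjectInspectorFactory;
import org.apache.hadoop.io.IntWritable;

// Hypothetical skeleton for msum(salary, 10).
public class GenericUDFMovingSum extends GenericUDF {
  private int windowSize;

  @Override
  public ObjectInspector initialize(ObjectInspector[] arguments) throws UDFArgumentException {
    // The window size (the literal 10) should arrive as a constant inspector,
    // from which the literal value can be read before any rows are processed.
    if (!(arguments[1] instanceof ConstantObjectInspector)) {
      throw new UDFArgumentException("msum() expects a constant window size");
    }
    Object constant = ((ConstantObjectInspector) arguments[1]).getWritableConstantValue();
    windowSize = ((IntWritable) constant).get();
    // ... size internal buffers using windowSize, validate arguments[0], etc.
    return PrimitiveObjectInspectorFactory.writableLongObjectInspector;
  }

  @Override
  public Object evaluate(DeferredObject[] args) throws HiveException {
    return null; // moving-sum logic omitted in this sketch
  }

  @Override
  public String getDisplayString(String[] children) {
    return "msum(" + children[0] + ", " + children[1] + ")";
  }
}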

Re: How connect to hive server without using jdbc

2012-09-25 Thread Bertrand Dechoux
I am interested in Dilip's response.
I didn't look at the details, and I stopped my proof of concept using Hive with JDBC
because the Thrift server was not 'concurrent safe'.

Of course, as Dilip said, the driver call can be made directly from Java (or any
language with a binding to it), so it must be fairly trivial to create a Java
application that 'forwards' the request. But then I don't understand the issue with
the Thrift server. Its usage is meant to be much more generic, and is that why the
current implementation is not safe - because it is much more complex to implement?
(I have no experience with Thrift, so my questions may be trivial.)

Regards

Bertrand

On Wed, Sep 26, 2012 at 6:46 AM, Dilip Joseph dilip.antony.jos...@gmail.com wrote:

 You don't necessarily need to run the thrift service to use JDBC. Please see:
 http://csgrad.blogspot.in/2010/04/to-use-language-other-than-java-say.html

 Dilip
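
For reference, a small sketch of the embedded-mode JDBC usage that Dilip's
suggestion points toward: running the old org.apache.hadoop.hive.jdbc.HiveDriver
in-process with the empty jdbc:hive:// URL, so no Thrift service is needed. Details
may differ across Hive versions; treat this as illustrative.

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

// Embedded mode: with an empty host/port the Hive JDBC driver runs the
// query in the local JVM instead of talking to a HiveServer/Thrift service.
public class HiveEmbeddedJdbcExample {
  public static void main(String[] args) throws Exception {
    Class.forName("org.apache.hadoop.hive.jdbc.HiveDriver");
    Connection con = DriverManager.getConnection("jdbc:hive://", "", "");
    Statement stmt = con.createStatement();
    ResultSet rs = stmt.executeQuery("select count(1) from table1");
    while (rs.next()) {
      System.out.println(rs.getString(1));
    }
    con.close();
  }
}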




-- 
Bertrand Dechoux