Re: Hive query failing

2012-09-25 Thread Sarath

Somehow (by resetting folder permissions) I got rid of the error below.
But now I'm getting a new error, shown below. It looks like I'm missing
some configuration, but I'm not sure what and where.


hive> select count(1) from table1;
Total MapReduce jobs = 1
Launching Job 1 out of 1
Number of reduce tasks determined at compile time: 1
In order to change the average load for a reducer (in bytes):
  set hive.exec.reducers.bytes.per.reducer=<number>
In order to limit the maximum number of reducers:
  set hive.exec.reducers.max=<number>
In order to set a constant number of reducers:
  set mapred.reduce.tasks=<number>
java.io.IOException: Cannot initialize Cluster. Please check your 
configuration for mapreduce.framework.name and the correspond server 
addresses.

at org.apache.hadoop.mapreduce.Cluster.initialize(Cluster.java:121)
at org.apache.hadoop.mapreduce.Cluster.<init>(Cluster.java:83)
at org.apache.hadoop.mapreduce.Cluster.<init>(Cluster.java:76)
at org.apache.hadoop.mapred.JobClient.init(JobClient.java:487)
at org.apache.hadoop.mapred.JobClient.<init>(JobClient.java:466)
at org.apache.hadoop.hive.ql.exec.ExecDriver.execute(ExecDriver.java:435)
at org.apache.hadoop.hive.ql.exec.MapRedTask.execute(MapRedTask.java:136)
at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:133)
at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:57)
at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1332)
at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1123)
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:931)
at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:255)
at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:212)
at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:403)
at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:671)
at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:554)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at org.apache.hadoop.util.RunJar.main(RunJar.java:208)
Job Submission failed with exception 'java.io.IOException(Cannot 
initialize Cluster. Please check your configuration for 
mapreduce.framework.name and the correspond server addresses.)'
FAILED: Execution Error, return code 1 from 
org.apache.hadoop.hive.ql.exec.MapRedTask
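
The property named in the new error normally lives in mapred-site.xml on
the machine submitting the job. A minimal sketch of the relevant entry,
assuming a CDH4 cluster running YARN (an MRv1 deployment would use
"classic" instead):

  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
  </property>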


On Tuesday 25 September 2012 05:39 PM, kulkarni.swar...@gmail.com wrote:
The jar is being looked up on HDFS, as the exception suggests. Run the
following commands:


$ hadoop fs -mkdir /usr/lib/hive/lib/
$ hadoop fs -put $HIVE_HOME/lib/hive-builtins-0.8.1-cdh4.0.1.jar /usr/lib/hive/lib/


Your queries should work now.
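
To confirm the copy landed where job submission expects it, a standard
listing should now show the jar:

$ hadoop fs -ls /usr/lib/hive/lib/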

On Sep 25, 2012, at 6:46 AM, Manish.Bhoge wrote:



Sarath,

Is this an external table that you ran the query on? How did you load the
table?  It looks like the error is about a file related to the table rather
than the CDH jar.


Thank You,

Manish

From: Sarath [mailto:sarathchandra.jos...@algofusiontech.com]
Sent: Tuesday, September 25, 2012 3:48 PM
To: user@hive.apache.org
Subject: Hive query failing

Hi,

When I run the query "select count(1) from table1;" it fails with the 
exception message as below.


java.io.FileNotFoundException: File does not exist: /usr/lib/hive/lib/hive-builtins-0.8.1-cdh4.0.1.jar
at org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:736)
at org.apache.hadoop.mapreduce.filecache.ClientDistributedCacheManager.getFileStatus(ClientDistributedCacheManager.java:208)
at org.apache.hadoop.mapreduce.filecache.ClientDistributedCacheManager.determineTimestamps(ClientDistributedCacheManager.java:71)
at org.apache.hadoop.mapreduce.JobSubmitter.copyAndConfigureFiles(JobSubmitter.java:245)
at org.apache.hadoop.mapreduce.JobSubmitter.copyAndConfigureFiles(JobSubmitter.java:283)
at org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:354)
at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1226)
at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1223)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:396)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1232)
at org.apache.hadoop.mapreduce.Job.submit(Job.java:1223)
at org.apache.hadoop.mapred.JobClient$1.run(JobClient.java:609)
at org.apache.hadoop.mapred.JobClient$1.run(JobClient.java:604)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:396)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1232)

RE: Custom MR scripts using java in Hive

2012-09-25 Thread Manish . Bhoge
Sorry for the late reply.

For anything you want to run as MAP and REDUCE, you have to extend the
MapReduce classes for your functionality, irrespective of the language
(Java, Python, or any other). Once you have extended the class, move the
jar to the Hadoop cluster.
Bertrand has also mentioned reflection. That is something new for me; you
can give reflection a try.
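
As a rough illustration of the streaming route, here is a TRANSFORM query
that pipes rows through an external script; the table, columns, and script
name are hypothetical:

ADD FILE my_mapper.py;
SELECT TRANSFORM (userid, ts)
  USING 'python my_mapper.py'
  AS (userid, event_count)
FROM clicks;

The script reads tab-separated rows on stdin and writes tab-separated rows
on stdout, which is what makes any language usable here.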

Thank You,
Manish

From: Tamil A [mailto:4tamil...@gmail.com]
Sent: Tuesday, September 25, 2012 6:48 PM
To: user@hive.apache.org
Subject: Re: Custom MR scripts using java in Hive

Hi Manish,

Thanks for your help. I did the same using a UDF, and am now trying the
TRANSFORM, MAP, and REDUCE clauses. So does this mean that with Java we
should go through UDFs, while other languages use MapReduce scripts, i.e.,
the TRANSFORM, MAP, and REDUCE clauses?
Please correct me if I'm wrong.



Thanks & Regards,
Manu

On Tue, Sep 25, 2012 at 5:19 PM, Manish.Bhoge
<manish.bh...@target.com> wrote:
Manu,

If you have written a UDF in Java for Hive, then you need to copy your JAR
to the /usr/lib/hive/lib/ folder on your Hadoop cluster for Hive to use it.

Thank You,
Manish

From: Manu A [mailto:hadoophi...@gmail.com]
Sent: Tuesday, September 25, 2012 3:44 PM
To: user@hive.apache.org
Subject: Custom MR scripts using java in Hive

Hi All,
I am learning Hive. Please let me know if anyone has tried custom MapReduce
scripts using Java in Hive, or refer me to some links and blogs with an
example.

When I tried, I got the error below:

Hadoop job information for Stage-1: number of mappers: 1; number of reducers: 0
2012-09-25 02:47:23,720 Stage-1 map = 0%,  reduce = 0%
2012-09-25 02:47:56,943 Stage-1 map = 100%,  reduce = 100%
Ended Job = job_20120931_0001 with errors
Error during job, obtaining debugging information...
Examining task ID: task_20120931_0001_m_02 (and more) from job 
job_20120931_0001
Exception in thread "Thread-51" java.lang.RuntimeException: Error while reading 
from task log url
at 
org.apache.hadoop.hive.ql.exec.errors.TaskLogProcessor.getErrors(TaskLogProcessor.java:130)
at 
org.apache.hadoop.hive.ql.exec.JobDebugger.showJobFailDebugInfo(JobDebugger.java:211)
at org.apache.hadoop.hive.ql.exec.JobDebugger.run(JobDebugger.java:81)
at java.lang.Thread.run(Thread.java:619)
Caused by: java.io.IOException: Server returned HTTP response code: 400 for 
URL: // removed as confidential
at 
sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1313)
at java.net.URL.openStream(URL.java:1010)
at 
org.apache.hadoop.hive.ql.exec.errors.TaskLogProcessor.getErrors(TaskLogProcessor.java:120)
... 3 more
FAILED: Execution Error, return code 2 from 
org.apache.hadoop.hive.ql.exec.MapRedTask
MapReduce Jobs Launched:
Job 0: Map: 1   HDFS Read: 0 HDFS Write: 0 FAIL
Total MapReduce CPU Time Spent: 0 msec




Thanks for your help in advance :)



Thanks & Regards,
Manu









--
Thanks & Regards,
Tamil



Re: How connect to hive server without using jdbc

2012-09-25 Thread Bertrand Dechoux
I am interested in Dilip's response.
I didn't look at the details, and I stopped my proof of concept using Hive
with JDBC because the Thrift server is not 'concurrent safe'.

Of course, as Dilip said, the driver call can be made directly from Java
(or any language having a binding to it). So it must be pretty trivial to
create a Java application 'forwarding' the request. But then I don't
understand the issue with the Thrift server. Is it that its usage is meant
to be much more generic, and that's why the current implementation is not
safe, because it is much more complex to implement? (I have no experience
with Thrift, so my questions may be trivial.)
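
For what it's worth, a minimal sketch of that in-process 'forwarding'
approach, assuming Hive 0.8-era class names (the signatures have moved
between releases, so treat this as an outline):

import java.util.ArrayList;
import org.apache.hadoop.hive.conf.HiveConf;
import org.apache.hadoop.hive.ql.Driver;
import org.apache.hadoop.hive.ql.session.SessionState;

public class EmbeddedHiveQuery {
  public static void main(String[] args) throws Exception {
    // Start a CLI-style session so the Driver has metastore/conf context.
    HiveConf conf = new HiveConf(SessionState.class);
    SessionState.start(conf);

    Driver driver = new Driver(conf);
    driver.run("select count(1) from table1");  // compile + execute

    // Drain results the way the CLI does internally.
    ArrayList<String> rows = new ArrayList<String>();
    while (driver.getResults(rows)) {
      for (String row : rows) {
        System.out.println(row);
      }
      rows.clear();
    }
  }
}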

Regards

Bertrand

On Wed, Sep 26, 2012 at 6:46 AM, Dilip Joseph  wrote:

> You don't necessarily need to run the thrift service to use JDBC.  Please
> see:
> http://csgrad.blogspot.in/2010/04/to-use-language-other-than-java-say.html
> .
>
> Dilip
>
> On Tue, Sep 25, 2012 at 11:01 AM, Abhishek wrote:
>
>> Hi Zhou,
>>
>> Thanks for the reply, we are shutting down thrift service due to security
>> issues with hive.
>>
>> Regards
>> Abhi
>>
>> Sent from my iPhone
>>
>> On Sep 25, 2012, at 1:53 PM, Doug Houck 
>> wrote:
>>
>> > Also, this is all open source, right?  So you could take a look at the
>> CLI code to see how it does it.
>> >
>> > - Original Message -
>> > From: "Abhishek" 
>> > To: user@hive.apache.org
>> > Cc: user@hive.apache.org
>> > Sent: Tuesday, September 25, 2012 1:44:41 PM
>> > Subject: Re: How connect to hive server without using jdbc
>> >
>> >
>> > Hi Zhou,
>> >
>> >
>> > But I am looking to connect to the hive server without JDBC; some other
>> > way through an API.
>> >
>> >
>> > Regards
>> > Abhi
>> >
>> > Sent from my iPhone
>> >
>> > On Sep 25, 2012, at 1:15 PM, Haijia Zhou < leons...@gmail.com > wrote:
>> >
>> >
>> >
>> >
>> > https://cwiki.apache.org/Hive/hiveclient.html#HiveClient-JDBC
>> >
>> >
>> > On Tue, Sep 25, 2012 at 1:13 PM, Abhishek < abhishek.dod...@gmail.com> 
>> > wrote:
>> >
>> >
>> > Hi all,
>> >
>> > Is there any way to connect to hive server through API??
>> >
>> > Regards
>> > Abhi
>> >
>> >
>> >
>> > Sent from my iPhone
>> >
>>
>
>
>
> --
> _
> Dilip Antony Joseph
> http://csgrad.blogspot.com
> http://www.marydilip.info
>



-- 
Bertrand Dechoux


Re: How connect to hive server without using jdbc

2012-09-25 Thread Dilip Joseph
You don't necessarily need to run the thrift service to use JDBC.  Please
see:
http://csgrad.blogspot.in/2010/04/to-use-language-other-than-java-say.html.

Dilip

On Tue, Sep 25, 2012 at 11:01 AM, Abhishek wrote:

> Hi Zhou,
>
> Thanks for the reply, we are shutting down thrift service due to security
> issues with hive.
>
> Regards
> Abhi
>
> Sent from my iPhone
>
> On Sep 25, 2012, at 1:53 PM, Doug Houck 
> wrote:
>
> > Also, this is all open source, right?  So you could take a look at the
> CLI code to see how it does it.
> >
> > - Original Message -
> > From: "Abhishek" 
> > To: user@hive.apache.org
> > Cc: user@hive.apache.org
> > Sent: Tuesday, September 25, 2012 1:44:41 PM
> > Subject: Re: How connect to hive server without using jdbc
> >
> >
> > Hi Zhou,
> >
> >
> > But I am looking to connect to the hive server without JDBC; some other
> > way through an API.
> >
> >
> > Regards
> > Abhi
> >
> > Sent from my iPhone
> >
> > On Sep 25, 2012, at 1:15 PM, Haijia Zhou < leons...@gmail.com > wrote:
> >
> >
> >
> >
> > https://cwiki.apache.org/Hive/hiveclient.html#HiveClient-JDBC
> >
> >
> > On Tue, Sep 25, 2012 at 1:13 PM, Abhishek < abhishek.dod...@gmail.com >
> wrote:
> >
> >
> > Hi all,
> >
> > Is there any way to connect to hive server through API??
> >
> > Regards
> > Abhi
> >
> >
> >
> > Sent from my iPhone
> >
>



-- 
_
Dilip Antony Joseph
http://csgrad.blogspot.com
http://www.marydilip.info


RE: Hive File Sizes, Merging, and Splits

2012-09-25 Thread Connell, Chuck
But remember that you are running on parallel machines. Depending on the 
hardware configuration, more map tasks is BETTER.



From: John Omernik [j...@omernik.com]
Sent: Tuesday, September 25, 2012 7:11 PM
To: user@hive.apache.org
Subject: Re: Hive File Sizes, Merging, and Splits

Isn't there an overhead associated with each map task?  Based on that, my
hypothesis is that if I pay attention to my data, merge up small files after
load, and ensure split sizes are close to file sizes, I can keep the number
of map tasks to an absolute minimum.


On Tue, Sep 25, 2012 at 2:35 PM, Connell, Chuck
<chuck.conn...@nuance.com> wrote:
Why do you think the current generated code is inefficient?



From: John Omernik [mailto:j...@omernik.com]
Sent: Tuesday, September 25, 2012 2:57 PM
To: user@hive.apache.org
Subject: Hive File Sizes, Merging, and Splits

I am really struggling trying to make heads or tails out of how to optimize the
data in my tables for best query times.  I have a partition that is compressed
(Gzip) RCFile data in two files

total 421877
263715 -rwxr-xr-x 1 darkness darkness 270044140 2012-09-25 13:32 00_0
158162 -rwxr-xr-x 1 darkness darkness 161956948 2012-09-25 13:32 01_0



No matter what I set my split settings to prior to the job, I always get three
mappers.  My block size is 268435456 but the setting doesn't seem to change
anything. I can set split size huge or small with no apparent effect on the
data.


I know there are many esoteric items here, but is there any good documentation
on setting these things to make my queries on this data more efficient? I am
not sure why it needs three map tasks on this data; it should really just grab
two mappers. Not to mention, I thought gzip wasn't splittable anyhow.  So, from
that standpoint, how does it even send data to three mappers?  If you know of
some secret cache of documentation for Hive, I'd love to read it.

Thanks




Re: Hive File Sizes, Merging, and Splits

2012-09-25 Thread John Omernik
Isn't there an overhead associated with each map task?  Based on that, my
hypothesis is that if I pay attention to my data, merge up small files after
load, and ensure split sizes are close to file sizes, I can keep the
number of map tasks to an absolute minimum.


On Tue, Sep 25, 2012 at 2:35 PM, Connell, Chuck wrote:

>  Why do you think the current generated code is inefficient?
>
> From: John Omernik [mailto:j...@omernik.com]
> Sent: Tuesday, September 25, 2012 2:57 PM
> To: user@hive.apache.org
> Subject: Hive File Sizes, Merging, and Splits
>
> I am really struggling trying to make heads or tails out of how to
> optimize the data in my tables for best query times.  I have a partition
> that is compressed (Gzip) RCFile data in two files
>
> total 421877
> 263715 -rwxr-xr-x 1 darkness darkness 270044140 2012-09-25 13:32 00_0
> 158162 -rwxr-xr-x 1 darkness darkness 161956948 2012-09-25 13:32 01_0
>
> No matter what I set my split settings to prior to the job, I always get
> three mappers.  My block size is 268435456 but the setting doesn't seem to
> change anything. I can set split size huge or small with no apparent effect
> on the data.
>
> I know there are many esoteric items here, but is there any good
> documentation on setting these things to make my queries on this data more
> efficient? I am not sure why it needs three map tasks on this data; it
> should really just grab two mappers. Not to mention, I thought gzip wasn't
> splittable anyhow.  So, from that standpoint, how does it even send data to
> three mappers?  If you know of some secret cache of documentation for Hive,
> I'd love to read it.
>
> Thanks
>


RE: Hive File Sizes, Merging, and Splits

2012-09-25 Thread Connell, Chuck
Why do you think the current generated code is inefficient?



From: John Omernik [mailto:j...@omernik.com]
Sent: Tuesday, September 25, 2012 2:57 PM
To: user@hive.apache.org
Subject: Hive File Sizes, Merging, and Splits

I am really struggling trying to make heads or tails out of how to optimize the
data in my tables for best query times.  I have a partition that is compressed
(Gzip) RCFile data in two files

total 421877
263715 -rwxr-xr-x 1 darkness darkness 270044140 2012-09-25 13:32 00_0
158162 -rwxr-xr-x 1 darkness darkness 161956948 2012-09-25 13:32 01_0



No matter what I set my split settings to prior to the job, I always get three
mappers.  My block size is 268435456 but the setting doesn't seem to change
anything. I can set split size huge or small with no apparent effect on the
data.


I know there are many esoteric items here, but is there any good documentation
on setting these things to make my queries on this data more efficient? I am
not sure why it needs three map tasks on this data; it should really just grab
two mappers. Not to mention, I thought gzip wasn't splittable anyhow.  So, from
that standpoint, how does it even send data to three mappers?  If you know of
some secret cache of documentation for Hive, I'd love to read it.

Thanks



How can I get the constant value from the ObjectInspector in the UDF

2012-09-25 Thread java8964 java8964

Hi, I am using the Cloudera release cdh3u3, which has Hive 0.7.1.
I am trying to write a Hive UDF to calculate a moving sum. Right now, I am
having trouble getting the constant value passed in at the initialization
stage.
For example, let's assume the function has the following format:

msum(salary, 10) - salary is an int type column

which means the end user wants the moving sum over the last 10 rows of
salary.
I kind of know how to implement this UDF, but I have one problem right now.
1) This is not a UDAF, as each row will return one value back as the moving
sum.
2) I created a UDF class extending GenericUDF.
3) I can get the column types from the ObjectInspector[] passed to me in
the initialize() method, to verify that 'salary' and 10 both need to be
numeric types (the latter needs to be an integer).
4) But I also want to get the real value of 10, in this case, in the
initialize() stage, so I can create the corresponding data structure based
on the value the end user specified.
5) I looked around the javadoc of the ObjectInspector class. I know that
at run time the real class of the 2nd parameter is
WritableIntObjectInspector. I can get the type, but how can I get the real
value of it?
6) This is kind of a constant ObjectInspector; it should be able to give
the value to me, as it already knows the type is int. But how?
7) I don't want to have to get the value at the evaluate stage. Can I get
this value at the initialize stage?
Thanks
Yong
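
A sketch of the initialize-time pattern in question, assuming a Hive
release that exposes ConstantObjectInspector (later than the 0.7 line
mentioned above, where the literal may only be reachable at evaluate
time); the class and the ring-buffer detail are illustrative:

import org.apache.hadoop.hive.ql.exec.UDFArgumentException;
import org.apache.hadoop.hive.ql.metadata.HiveException;
import org.apache.hadoop.hive.ql.udf.generic.GenericUDF;
import org.apache.hadoop.hive.serde2.objectinspector.ConstantObjectInspector;
import org.apache.hadoop.hive.serde2.objectinspector.ObjectInspector;
import org.apache.hadoop.hive.serde2.objectinspector.primitive.PrimitiveObjectInspectorFactory;
import org.apache.hadoop.io.IntWritable;

public class GenericUDFMovingSum extends GenericUDF {
  private int windowSize;  // the "10" in msum(salary, 10)

  @Override
  public ObjectInspector initialize(ObjectInspector[] args)
      throws UDFArgumentException {
    if (args.length != 2) {
      throw new UDFArgumentException("msum(col, n) takes two arguments");
    }
    // When the second argument is a literal, Hive passes a
    // ConstantObjectInspector, so the value is known before any rows flow.
    if (!(args[1] instanceof ConstantObjectInspector)) {
      throw new UDFArgumentException("second argument must be a constant");
    }
    IntWritable n = (IntWritable)
        ((ConstantObjectInspector) args[1]).getWritableConstantValue();
    windowSize = n.get();
    // ...allocate a ring buffer of windowSize entries here...
    return PrimitiveObjectInspectorFactory.writableLongObjectInspector;
  }

  @Override
  public Object evaluate(DeferredObject[] args) throws HiveException {
    return null;  // the row-at-a-time moving sum would go here
  }

  @Override
  public String getDisplayString(String[] children) {
    return "msum(" + children[0] + ", " + children[1] + ")";
  }
}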

Hive File Sizes, Merging, and Splits

2012-09-25 Thread John Omernik
I am really struggling trying to make heads or tails out of how to optimize
the data in my tables for best query times.  I have a partition that is
compressed (Gzip) RCFile data in two files

total 421877
263715 -rwxr-xr-x 1 darkness darkness 270044140 2012-09-25 13:32 00_0
158162 -rwxr-xr-x 1 darkness darkness 161956948 2012-09-25 13:32 01_0



No matter what I set my split settings to prior to the job, I always get
three mappers.  My block size is 268435456 but the setting doesn't seem to
change anything. I can set split size huge or small with no apparent effect
on the data.


I know there are many esoteric items here, but is there any good
documentation on setting these things to make my queries on this data more
efficient? I am not sure why it needs three map tasks on this data; it
should really just grab two mappers. Not to mention, I thought gzip wasn't
splittable anyhow. So, from that standpoint, how does it even send data to
three mappers? If you know of some secret cache of documentation for Hive,
I'd love to read it.

Thanks
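
For reference, the knobs that usually govern split sizing and small-file
merging in this era of Hive (property names per Hive 0.8/0.9; the values
here are a sketch to tune, not a prescription):

set hive.input.format=org.apache.hadoop.hive.ql.io.CombineHiveInputFormat;
set mapred.max.split.size=268435456;
set mapred.min.split.size.per.node=268435456;
set mapred.min.split.size.per.rack=268435456;

set hive.merge.mapfiles=true;
set hive.merge.mapredfiles=true;
set hive.merge.size.per.task=256000000;

Note also that RCFile compresses each internal block separately, so a
gzip-compressed RCFile can still be split across mappers, unlike a plain
gzipped text file.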


Re: How connect to hive server without using jdbc

2012-09-25 Thread Abhishek
Hi Doug,

Thanks for the reply. Can you point me to the CLI code?

Regards 
Abhi

Sent from my iPhone

On Sep 25, 2012, at 1:53 PM, Doug Houck  wrote:

> Also, this is all open source, right?  So you could take a look at the CLI 
> code to see how it does it.
> 
> - Original Message -
> From: "Abhishek" 
> To: user@hive.apache.org
> Cc: user@hive.apache.org
> Sent: Tuesday, September 25, 2012 1:44:41 PM
> Subject: Re: How connect to hive server without using jdbc
> 
> 
> Hi Zhou, 
> 
> 
> But I am looking to connect to the hive server without JDBC; some other way
> through an API.
> 
> 
> Regards 
> Abhi 
> 
> Sent from my iPhone 
> 
> On Sep 25, 2012, at 1:15 PM, Haijia Zhou < leons...@gmail.com > wrote: 
> 
> 
> 
> 
> https://cwiki.apache.org/Hive/hiveclient.html#HiveClient-JDBC 
> 
> 
> On Tue, Sep 25, 2012 at 1:13 PM, Abhishek < abhishek.dod...@gmail.com > 
> wrote: 
> 
> 
> Hi all, 
> 
> Is there any way to connect to hive server through API?? 
> 
> Regards 
> Abhi 
> 
> 
> 
> Sent from my iPhone 
> 


Re: How connect to hive server without using jdbc

2012-09-25 Thread Abhishek
Hi Zhou,

Thanks for the reply, we are shutting down thrift service due to security 
issues with hive.

Regards
Abhi

Sent from my iPhone

On Sep 25, 2012, at 1:53 PM, Doug Houck  wrote:

> Also, this is all open source, right?  So you could take a look at the CLI 
> code to see how it does it.
> 
> - Original Message -
> From: "Abhishek" 
> To: user@hive.apache.org
> Cc: user@hive.apache.org
> Sent: Tuesday, September 25, 2012 1:44:41 PM
> Subject: Re: How connect to hive server without using jdbc
> 
> 
> Hi Zhou, 
> 
> 
> But I am looking to connect to the hive server without JDBC; some other way
> through an API.
> 
> 
> Regards 
> Abhi 
> 
> Sent from my iPhone 
> 
> On Sep 25, 2012, at 1:15 PM, Haijia Zhou < leons...@gmail.com > wrote: 
> 
> 
> 
> 
> https://cwiki.apache.org/Hive/hiveclient.html#HiveClient-JDBC 
> 
> 
> On Tue, Sep 25, 2012 at 1:13 PM, Abhishek < abhishek.dod...@gmail.com > 
> wrote: 
> 
> 
> Hi all, 
> 
> Is there any way to connect to hive server through API?? 
> 
> Regards 
> Abhi 
> 
> 
> 
> Sent from my iPhone 
> 


Re: How connect to hive server without using jdbc

2012-09-25 Thread Doug Houck
Also, this is all open source, right?  So you could take a look at the CLI code 
to see how it does it.

- Original Message -
From: "Abhishek" 
To: user@hive.apache.org
Cc: user@hive.apache.org
Sent: Tuesday, September 25, 2012 1:44:41 PM
Subject: Re: How connect to hive server without using jdbc


Hi Zhou, 


But I am looking to connect to the hive server without JDBC; some other way
through an API.


Regards 
Abhi 

Sent from my iPhone 

On Sep 25, 2012, at 1:15 PM, Haijia Zhou < leons...@gmail.com > wrote: 




https://cwiki.apache.org/Hive/hiveclient.html#HiveClient-JDBC 


On Tue, Sep 25, 2012 at 1:13 PM, Abhishek < abhishek.dod...@gmail.com > wrote: 


Hi all, 

Is there any way to connect to hive server through API?? 

Regards 
Abhi 



Sent from my iPhone 



Re: How connect to hive server without using jdbc

2012-09-25 Thread Haijia Zhou
The page also contains information about using other APIs to connect to
Hive.

On Tue, Sep 25, 2012 at 1:44 PM, Abhishek  wrote:

> Hi Zhou,
>
> But Iam looking to connect to hive server without jdbc. Some other way
> through API
>
> Regards
> Abhi
>
> Sent from my iPhone
>
> On Sep 25, 2012, at 1:15 PM, Haijia Zhou  wrote:
>
> https://cwiki.apache.org/Hive/hiveclient.html#HiveClient-JDBC
>
> On Tue, Sep 25, 2012 at 1:13 PM, Abhishek wrote:
>
>> Hi all,
>>
>> Is there any way to connect to hive server through API??
>>
>> Regards
>> Abhi
>>
>>
>>
>> Sent from my iPhone
>>
>
>


Re: How connect to hive server without using jdbc

2012-09-25 Thread Abhishek
Hi Zhou,

But I am looking to connect to the hive server without JDBC; some other way
through an API.

Regards
Abhi

Sent from my iPhone

On Sep 25, 2012, at 1:15 PM, Haijia Zhou  wrote:

> https://cwiki.apache.org/Hive/hiveclient.html#HiveClient-JDBC
> 
> On Tue, Sep 25, 2012 at 1:13 PM, Abhishek  wrote:
>> Hi all,
>> 
>> Is there any way to connect to hive server through API??
>> 
>> Regards
>> Abhi
>> 
>> 
>> 
>> Sent from my iPhone
> 


Re: How connect to hive server without using jdbc

2012-09-25 Thread Haijia Zhou
https://cwiki.apache.org/Hive/hiveclient.html#HiveClient-JDBC

On Tue, Sep 25, 2012 at 1:13 PM, Abhishek  wrote:

> Hi all,
>
> Is there any way to connect to hive server through API??
>
> Regards
> Abhi
>
>
>
> Sent from my iPhone
>


How connect to hive server without using jdbc

2012-09-25 Thread Abhishek
Hi all,

Is there any way to connect to hive server through API??

Regards
Abhi



Sent from my iPhone


Re: Custom MR scripts using java in Hive

2012-09-25 Thread Bertrand Dechoux
For Java, you can also consider reflection:
http://hive.apache.org/docs/r0.9.0/udf/reflect.html
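
For instance, the built-in reflect() UDF calls a static Java method
directly from a query, with no custom jar (the table name here is
hypothetical):

SELECT reflect('java.lang.Math', 'max', 2, 3) FROM src LIMIT 1;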

Regards

Bertrand

On Tue, Sep 25, 2012 at 3:18 PM, Tamil A <4tamil...@gmail.com> wrote:

> Hi Manish,
>
> Thanks for your help. I did the same using a UDF, and am now trying the
> TRANSFORM, MAP, and REDUCE clauses. So does this mean that with Java we
> should go through UDFs, while other languages use MapReduce scripts, i.e.,
> the TRANSFORM, MAP, and REDUCE clauses?
> Please correct me if I'm wrong.
>
>
>
>
> Thanks & Regards,
>
> Manu 
>
> On Tue, Sep 25, 2012 at 5:19 PM, Manish.Bhoge wrote:
>
>> Manu,
>>
>> If you have written a UDF in Java for Hive, then you need to copy your JAR
>> to the /usr/lib/hive/lib/ folder on your Hadoop cluster for Hive to use it.
>>
>> Thank You,
>>
>> Manish
>>
>> From: Manu A [mailto:hadoophi...@gmail.com]
>> Sent: Tuesday, September 25, 2012 3:44 PM
>> To: user@hive.apache.org
>> Subject: Custom MR scripts using java in Hive
>>
>> Hi All,
>>
>> I am learning Hive. Please let me know if anyone has tried custom MapReduce
>> scripts using Java in Hive, or refer me to some links and blogs with an
>> example.
>>
>> When I tried, I got the error below:
>>
>> Hadoop job information for Stage-1: number of mappers: 1; number of
>> reducers: 0
>> 2012-09-25 02:47:23,720 Stage-1 map = 0%,  reduce = 0%
>> 2012-09-25 02:47:56,943 Stage-1 map = 100%,  reduce = 100%
>> Ended Job = job_20120931_0001 with errors
>> Error during job, obtaining debugging information...
>> Examining task ID: task_20120931_0001_m_02 (and more) from job
>> job_20120931_0001
>> Exception in thread "Thread-51" java.lang.RuntimeException: Error while
>> reading from task log url
>> at
>> org.apache.hadoop.hive.ql.exec.errors.TaskLogProcessor.getErrors(TaskLogProcessor.java:130)
>> at
>> org.apache.hadoop.hive.ql.exec.JobDebugger.showJobFailDebugInfo(JobDebugger.java:211)
>> at
>> org.apache.hadoop.hive.ql.exec.JobDebugger.run(JobDebugger.java:81)
>> at java.lang.Thread.run(Thread.java:619)
>> Caused by: java.io.IOException: Server returned HTTP response code: 400
>> for URL: // removed as confidential
>>
>> at
>> sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1313)
>> at java.net.URL.openStream(URL.java:1010)
>> at
>> org.apache.hadoop.hive.ql.exec.errors.TaskLogProcessor.getErrors(TaskLogProcessor.java:120)
>> ... 3 more
>> FAILED: Execution Error, return code 2 from
>> org.apache.hadoop.hive.ql.exec.MapRedTask
>> MapReduce Jobs Launched:
>> Job 0: Map: 1   HDFS Read: 0 HDFS Write: 0 FAIL
>> Total MapReduce CPU Time Spent: 0 msec
>>
>> Thanks for your help in advance :)
>>
>> Thanks & Regards,
>>
>> Manu
>>
>
>
>
> --
> Thanks & Regards,
> Tamil
>
>


-- 
Bertrand Dechoux


Re: Custom MR scripts using java in Hive

2012-09-25 Thread Manu A
Thanks, Manish. I'll try the same.



Thanks & Regards,

Manu 

 


On Tue, Sep 25, 2012 at 5:19 PM, Manish.Bhoge wrote:

> Manu,
>
> If you have written a UDF in Java for Hive, then you need to copy your JAR
> to the /usr/lib/hive/lib/ folder on your Hadoop cluster for Hive to use it.
>
> Thank You,
>
> Manish
>
> From: Manu A [mailto:hadoophi...@gmail.com]
> Sent: Tuesday, September 25, 2012 3:44 PM
> To: user@hive.apache.org
> Subject: Custom MR scripts using java in Hive
>
> Hi All,
>
> I am learning Hive. Please let me know if anyone has tried custom MapReduce
> scripts using Java in Hive, or refer me to some links and blogs with an
> example.
>
> When I tried, I got the error below:
>
> Hadoop job information for Stage-1: number of mappers: 1; number of
> reducers: 0
> 2012-09-25 02:47:23,720 Stage-1 map = 0%,  reduce = 0%
> 2012-09-25 02:47:56,943 Stage-1 map = 100%,  reduce = 100%
> Ended Job = job_20120931_0001 with errors
> Error during job, obtaining debugging information...
> Examining task ID: task_20120931_0001_m_02 (and more) from job
> job_20120931_0001
> Exception in thread "Thread-51" java.lang.RuntimeException: Error while
> reading from task log url
> at
> org.apache.hadoop.hive.ql.exec.errors.TaskLogProcessor.getErrors(TaskLogProcessor.java:130)
> at
> org.apache.hadoop.hive.ql.exec.JobDebugger.showJobFailDebugInfo(JobDebugger.java:211)
> at
> org.apache.hadoop.hive.ql.exec.JobDebugger.run(JobDebugger.java:81)
> at java.lang.Thread.run(Thread.java:619)
> Caused by: java.io.IOException: Server returned HTTP response code: 400
> for URL: // removed as confidential
>
> at
> sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1313)
> at java.net.URL.openStream(URL.java:1010)
> at
> org.apache.hadoop.hive.ql.exec.errors.TaskLogProcessor.getErrors(TaskLogProcessor.java:120)
> ... 3 more
> FAILED: Execution Error, return code 2 from
> org.apache.hadoop.hive.ql.exec.MapRedTask
> MapReduce Jobs Launched:
> Job 0: Map: 1   HDFS Read: 0 HDFS Write: 0 FAIL
> Total MapReduce CPU Time Spent: 0 msec
>
> Thanks for your help in advance :)
>
> Thanks & Regards,
>
> Manu
>


Re: Custom MR scripts using java in Hive

2012-09-25 Thread Tamil A
Hi Manish,

Thanks for your help. I did the same using a UDF, and am now trying the
TRANSFORM, MAP, and REDUCE clauses. So does this mean that with Java we
should go through UDFs, while other languages use MapReduce scripts, i.e.,
the TRANSFORM, MAP, and REDUCE clauses?
Please correct me if I'm wrong.




Thanks & Regards,

Manu 

On Tue, Sep 25, 2012 at 5:19 PM, Manish.Bhoge wrote:

> Manu,
>
> If you have written a UDF in Java for Hive, then you need to copy your JAR
> to the /usr/lib/hive/lib/ folder on your Hadoop cluster for Hive to use it.
>
> Thank You,
>
> Manish
>
> From: Manu A [mailto:hadoophi...@gmail.com]
> Sent: Tuesday, September 25, 2012 3:44 PM
> To: user@hive.apache.org
> Subject: Custom MR scripts using java in Hive
>
> Hi All,
>
> I am learning Hive. Please let me know if anyone has tried custom MapReduce
> scripts using Java in Hive, or refer me to some links and blogs with an
> example.
>
> When I tried, I got the error below:
>
> Hadoop job information for Stage-1: number of mappers: 1; number of
> reducers: 0
> 2012-09-25 02:47:23,720 Stage-1 map = 0%,  reduce = 0%
> 2012-09-25 02:47:56,943 Stage-1 map = 100%,  reduce = 100%
> Ended Job = job_20120931_0001 with errors
> Error during job, obtaining debugging information...
> Examining task ID: task_20120931_0001_m_02 (and more) from job
> job_20120931_0001
> Exception in thread "Thread-51" java.lang.RuntimeException: Error while
> reading from task log url
> at
> org.apache.hadoop.hive.ql.exec.errors.TaskLogProcessor.getErrors(TaskLogProcessor.java:130)
> at
> org.apache.hadoop.hive.ql.exec.JobDebugger.showJobFailDebugInfo(JobDebugger.java:211)
> at
> org.apache.hadoop.hive.ql.exec.JobDebugger.run(JobDebugger.java:81)
> at java.lang.Thread.run(Thread.java:619)
> Caused by: java.io.IOException: Server returned HTTP response code: 400
> for URL: // removed as confidential
>
> at
> sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1313)
> at java.net.URL.openStream(URL.java:1010)
> at
> org.apache.hadoop.hive.ql.exec.errors.TaskLogProcessor.getErrors(TaskLogProcessor.java:120)
> ... 3 more
> FAILED: Execution Error, return code 2 from
> org.apache.hadoop.hive.ql.exec.MapRedTask
> MapReduce Jobs Launched:
> Job 0: Map: 1   HDFS Read: 0 HDFS Write: 0 FAIL
> Total MapReduce CPU Time Spent: 0 msec
>
> Thanks for your help in advance :)
>
> Thanks & Regards,
>
> Manu
>



--
Thanks & Regards,
Tamil


Re: Hive query failing

2012-09-25 Thread kulkarni . swarnim
The jar is being looked up on HDFS, as the exception suggests. Run the
following commands:

$ hadoop fs -mkdir /usr/lib/hive/lib
$ hadoop fs -put $HIVE_HOME/lib/hive-builtins-0.8.1-cdh4.0.1.jar 
/usr/lib/hive/lib

Your queries should work now.

On Sep 25, 2012, at 6:46 AM, Manish.Bhoge  wrote:

> Sarath,
>
> Is this an external table that you ran the query on? How did you load the
> table?  It looks like the error is about a file related to the table rather
> than the CDH jar.
>  
> Thank You,
> Manish
>  
> From: Sarath [mailto:sarathchandra.jos...@algofusiontech.com] 
> Sent: Tuesday, September 25, 2012 3:48 PM
> To: user@hive.apache.org
> Subject: Hive query failing
>  
> Hi,
> 
> When I run the query "select count(1) from table1;" it fails with the 
> exception message as below.
> 
> java.io.FileNotFoundException: File does not exist: /usr/lib/hive/lib/hive-builtins-0.8.1-cdh4.0.1.jar
> at org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:736)
> at org.apache.hadoop.mapreduce.filecache.ClientDistributedCacheManager.getFileStatus(ClientDistributedCacheManager.java:208)
> at org.apache.hadoop.mapreduce.filecache.ClientDistributedCacheManager.determineTimestamps(ClientDistributedCacheManager.java:71)
> at org.apache.hadoop.mapreduce.JobSubmitter.copyAndConfigureFiles(JobSubmitter.java:245)
> at org.apache.hadoop.mapreduce.JobSubmitter.copyAndConfigureFiles(JobSubmitter.java:283)
> at org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:354)
> at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1226)
> at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1223)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:396)
> at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1232)
> at org.apache.hadoop.mapreduce.Job.submit(Job.java:1223)
> at org.apache.hadoop.mapred.JobClient$1.run(JobClient.java:609)
> at org.apache.hadoop.mapred.JobClient$1.run(JobClient.java:604)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:396)
> at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1232)
> at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:604)
> at org.apache.hadoop.hive.ql.exec.ExecDriver.execute(ExecDriver.java:452)
> at org.apache.hadoop.hive.ql.exec.ExecDriver.main(ExecDriver.java:710)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
> at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
> at java.lang.reflect.Method.invoke(Method.java:597)
> at org.apache.hadoop.util.RunJar.main(RunJar.java:208)
> Job Submission failed with exception 'java.io.FileNotFoundException(File does not exist: /usr/lib/hive/lib/hive-builtins-0.8.1-cdh4.0.1.jar)'
> Execution failed with exit status: 2
> 
> But the JAR mentioned in the message exists and appropriate R/W permissions 
> have been set on the folder /usr/lib/hive for the user.
> What is going wrong?
> 
> Regards,
> Sarath.


RE: Custom MR scripts using java in Hive

2012-09-25 Thread Manish . Bhoge
Manu,

If you have written a UDF in Java for Hive, then you need to copy your JAR
to the /usr/lib/hive/lib/ folder on your Hadoop cluster for Hive to use it.
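
Alternatively, a jar can be registered per session from the Hive CLI and
bound to a function name; the path and class name below are illustrative:

hive> ADD JAR /path/to/my-udfs.jar;
hive> CREATE TEMPORARY FUNCTION my_func AS 'com.example.hive.MyUDF';
hive> SELECT my_func(col1) FROM some_table;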

Thank You,
Manish

From: Manu A [mailto:hadoophi...@gmail.com]
Sent: Tuesday, September 25, 2012 3:44 PM
To: user@hive.apache.org
Subject: Custom MR scripts using java in Hive

Hi All,
I am learning Hive. Please let me know if anyone has tried custom MapReduce
scripts using Java in Hive, or refer me to some links and blogs with an example.

When I tried, I got the error below:

Hadoop job information for Stage-1: number of mappers: 1; number of reducers: 0
2012-09-25 02:47:23,720 Stage-1 map = 0%,  reduce = 0%
2012-09-25 02:47:56,943 Stage-1 map = 100%,  reduce = 100%
Ended Job = job_20120931_0001 with errors
Error during job, obtaining debugging information...
Examining task ID: task_20120931_0001_m_02 (and more) from job 
job_20120931_0001
Exception in thread "Thread-51" java.lang.RuntimeException: Error while reading 
from task log url
at 
org.apache.hadoop.hive.ql.exec.errors.TaskLogProcessor.getErrors(TaskLogProcessor.java:130)
at 
org.apache.hadoop.hive.ql.exec.JobDebugger.showJobFailDebugInfo(JobDebugger.java:211)
at org.apache.hadoop.hive.ql.exec.JobDebugger.run(JobDebugger.java:81)
at java.lang.Thread.run(Thread.java:619)
Caused by: java.io.IOException: Server returned HTTP response code: 400 for 
URL: // removed as confidential
at 
sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1313)
at java.net.URL.openStream(URL.java:1010)
at 
org.apache.hadoop.hive.ql.exec.errors.TaskLogProcessor.getErrors(TaskLogProcessor.java:120)
... 3 more
FAILED: Execution Error, return code 2 from 
org.apache.hadoop.hive.ql.exec.MapRedTask
MapReduce Jobs Launched:
Job 0: Map: 1   HDFS Read: 0 HDFS Write: 0 FAIL
Total MapReduce CPU Time Spent: 0 msec




Thanks for your help in advance :)



Thanks & Regards,
Manu








RE: Hive query failing

2012-09-25 Thread Manish . Bhoge
Sarath,

Is this an external table that you ran the query on? How did you load the
table?  It looks like the error is about a file related to the table rather
than the CDH jar.

Thank You,
Manish

From: Sarath [mailto:sarathchandra.jos...@algofusiontech.com]
Sent: Tuesday, September 25, 2012 3:48 PM
To: user@hive.apache.org
Subject: Hive query failing

Hi,

When I run the query "select count(1) from table1;" it fails with the exception 
message as below.

java.io.FileNotFoundException: File does not exist: /usr/lib/hive/lib/hive-builtins-0.8.1-cdh4.0.1.jar
at org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:736)
at org.apache.hadoop.mapreduce.filecache.ClientDistributedCacheManager.getFileStatus(ClientDistributedCacheManager.java:208)
at org.apache.hadoop.mapreduce.filecache.ClientDistributedCacheManager.determineTimestamps(ClientDistributedCacheManager.java:71)
at org.apache.hadoop.mapreduce.JobSubmitter.copyAndConfigureFiles(JobSubmitter.java:245)
at org.apache.hadoop.mapreduce.JobSubmitter.copyAndConfigureFiles(JobSubmitter.java:283)
at org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:354)
at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1226)
at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1223)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:396)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1232)
at org.apache.hadoop.mapreduce.Job.submit(Job.java:1223)
at org.apache.hadoop.mapred.JobClient$1.run(JobClient.java:609)
at org.apache.hadoop.mapred.JobClient$1.run(JobClient.java:604)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:396)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1232)
at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:604)
at org.apache.hadoop.hive.ql.exec.ExecDriver.execute(ExecDriver.java:452)
at org.apache.hadoop.hive.ql.exec.ExecDriver.main(ExecDriver.java:710)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at org.apache.hadoop.util.RunJar.main(RunJar.java:208)
Job Submission failed with exception 'java.io.FileNotFoundException(File does not exist: /usr/lib/hive/lib/hive-builtins-0.8.1-cdh4.0.1.jar)'
Execution failed with exit status: 2

But the JAR mentioned in the message exists and appropriate R/W permissions 
have been set on the folder /usr/lib/hive for the user.
What is going wrong?

Regards,
Sarath.