hyphen in hive struct field

2015-03-25 Thread Udit Mehta
Hi,

I have a hive table query:

create external table test (field1 struct<`inner-table`:string>);

I believe hyphens are disallowed, but I read that we can work around that by
putting backticks (``) around the name. Even this seems to fail, though.

Is there a way around this, or are hyphens not allowed in nested Hive types?

Thanks,
Udit


Can WebHCat show non-MapReduce jobs?

2015-03-25 Thread Xiaoyong Zhu
It seems that WebHCat can only show MapReduce jobs. For example, if I submit
a Hive on Tez job via WebHCat, I can only get the TempletonControllerJob ID
(which is a MAPREDUCE job), but I cannot get the ID of the Tez job that the
TempletonControllerJob launches.

Is this by design? Is there a way to return all types of jobs via WebHCat?

Xiaoyong



RE: Executing HQL files from JAVA application.

2015-03-25 Thread Amal Gupta
Hi Gopal,

Thanks a lot. 
Connectivity to HiveServer2 was not an issue; we were able to connect using
the example that you shared and using Beeline. The issue is executing a
script from the Java app. Maybe I missed something, but I was not able to
find an efficient and elegant way to execute Hive scripts placed at a
specific location from the Java app.

The scenario is: an app placed at location A should be able to connect to
HiveServer2 at B and execute the script TestHive.hql placed at a location on
B (say root/testProject/hive/scripts/TestHive.hql).

Regards,
Amal

-Original Message-
From: Gopal Vijayaraghavan [mailto:go...@hortonworks.com] On Behalf Of Gopal 
Vijayaraghavan
Sent: Wednesday, March 25, 2015 8:49 AM
To: user@hive.apache.org
Cc: Amal Gupta
Subject: Re: Executing HQL files from JAVA application.

Hi,

Any mechanism which bypasses schema layers for SQL is a bad idea.

See this example for how you can connect to HiveServer2 directly from Java:
https://cwiki.apache.org/confluence/display/Hive/HiveServer2+Clients#HiveServer2Clients-JDBCClientSampleCode

Use the JDBC driver to access HiveServer2 through a first-class Java API.
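
For the script-file part of your question, a minimal sketch of reading an
.hql file and running its statements over JDBC (hedged: this assumes the
script is readable from the machine the Java app runs on, and the naive
split on ';' only suits simple scripts; host, user and path names are
illustrative):

import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.Statement;

public class HqlScriptRunner {
    public static void main(String[] args) throws Exception {
        Class.forName("org.apache.hive.jdbc.HiveDriver");
        // Read the whole script file into one string.
        String script = new String(
                Files.readAllBytes(Paths.get("/path/to/TestHive.hql")),
                StandardCharsets.UTF_8);
        try (Connection con = DriverManager.getConnection(
                     "jdbc:hive2://hs2-host:10000/test_db", "user", "password");
             Statement stmt = con.createStatement()) {
            // Naive statement split; breaks on semicolons inside literals.
            for (String sql : script.split(";")) {
                if (!sql.trim().isEmpty()) {
                    stmt.execute(sql.trim());
                }
            }
        }
    }
}

Note that JDBC only ships statements, not files, so if the script really
must live on the HiveServer2 box the app still needs some way to read it
(e.g. a shared mount).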

If you find any performance issues with this method, let me know and I can fix 
it.

Cheers,
Gopal

From: Amal Gupta <amal.gup...@aexp.com>
Reply-To: user@hive.apache.org
Date: Sunday, March 22, 2015 at 10:53 PM
To: user@hive.apache.org
Subject: RE: Executing HQL files from JAVA application.


Hey Mich,

 
Got any clues regarding the failure of the code that I sent?
 
I was going through the project and the code again, and I suspect mismatched
dependencies to be the culprit. I am currently trying to re-align the
dependencies as per the POM given on mvnrepository.com, to see whether a
particular configuration succeeds.

Will keep you posted on my progress. Thanks again for all the help that you
are providing. :)
 
Regards,
Amal

 
From: Amal Gupta

Sent: Sunday, March 22, 2015 7:52 AM
To: user@hive.apache.org
Subject: RE: Executing HQL files from JAVA application.


 
Hi Mich,
 
:) A coincidence. Even I am new to Hive. The test script which I am trying
to execute contains a drop and a create statement.
 
Script :-

use test_db;
DROP TABLE IF EXISTS demoHiveTable;
CREATE EXTERNAL TABLE demoHiveTable (
demoId string,
demoName string
) ROW FORMAT DELIMITED FIELDS TERMINATED BY ',' LINES TERMINATED BY '\n'
STORED AS TEXTFILE LOCATION '/hive/';
 
 
Java Code: -
Not sure whether this will have an impact, but the code is part of a Spring
Batch Tasklet triggered from the Batch-Context. This tasklet runs in
parallel with other tasklets.
 
  
public RepeatStatus execute(StepContribution arg0, ChunkContext arg1)
        throws Exception {
    // Run Beeline embedded, pointing it at the script file via -f.
    String[] args = {
            "-d", BeeLine.BEELINE_DEFAULT_JDBC_DRIVER,
            "-u", "jdbc:hive2://server-name:1/test_db",
            "-n", "**", "-p", "**",
            "-f", "C://Work//test_hive.hql"};

    // Capture Beeline's output instead of letting it go to stdout.
    BeeLine beeline = new BeeLine();
    ByteArrayOutputStream os = new ByteArrayOutputStream();
    PrintStream beelineOutputStream = new PrintStream(os);
    beeline.setOutputStream(beelineOutputStream);
    beeline.setErrorStream(beelineOutputStream);
    beeline.begin(args, null);
    String output = os.toString("UTF-8");
    System.out.println(output);

    return RepeatStatus.FINISHED;
}
 
It would be great if you could share the piece of code that worked for you.
Maybe it will give me some pointers on how to go ahead.

 
Best Regards,
Amal

 
From: Mich Talebzadeh [mailto:m...@peridale.co.uk]

Sent: Sunday, March 22, 2015 2:58 AM
To: user@hive.apache.org
Subject: RE: Executing HQL files from JAVA application.


 
Hi Amal;
 
Coming from a relational database (Oracle, Sybase) background, I always
expect that a DDL statement like DROP TABLE has to run in its own
transaction and cannot be combined with a DML statement.
 
Now I suspect that when you run the command DROP TABLE IF EXISTS table_name;
like below in Beeline, it works:

0: jdbc:hive2://rhes564:10010/default> drop table if exists mytest;
No rows affected (0.216 seconds)

That runs in its own transaction, so it works. However, I suspect that in
Java that is not the case. Can you possibly provide your Java code so we can
see what exactly it is doing?
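
To illustrate what I mean, a sketch of issuing each statement separately
over JDBC (hedged: the connection details are copied from my Beeline prompt
above; the table, user and password are illustrative):

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.Statement;

public class DropThenCreate {
    public static void main(String[] args) throws Exception {
        Class.forName("org.apache.hive.jdbc.HiveDriver");
        try (Connection con = DriverManager.getConnection(
                     "jdbc:hive2://rhes564:10010/default", "user", "password");
             Statement stmt = con.createStatement()) {
            // One statement per execute() call, as Beeline effectively does.
            stmt.execute("drop table if exists mytest");
            stmt.execute("create table mytest (id int)");
        }
    }
}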
 
Thanks,
 
Mich
 
http://talebzadehmich.wordpress.com
 
Publications due shortly:
Creating in-memory Data Grid for Trading Systems with Oracle TimesTen and 
Coherence Cache
 

Re: Can WebHCat show non-MapReduce jobs?

2015-03-25 Thread Eugene Koifman
https://cwiki.apache.org/confluence/display/Hive/WebHCat+Reference+Jobs
should produce all jobs (assuming the calling user has permission to see
them). templeton.Server.showJobList() has detailed JavaDoc.
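
For reference, a minimal sketch of calling that endpoint from Java (hedged:
assumes WebHCat's default port 50111 and a user.name your cluster accepts;
the host name is illustrative):

import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.net.HttpURLConnection;
import java.net.URL;

public class WebHcatJobList {
    public static void main(String[] args) throws Exception {
        // GET /templeton/v1/jobs returns a JSON array of {id, detail}.
        URL url = new URL(
                "http://webhcat-host:50111/templeton/v1/jobs?user.name=hive");
        HttpURLConnection conn = (HttpURLConnection) url.openConnection();
        conn.setRequestMethod("GET");
        try (BufferedReader in = new BufferedReader(
                new InputStreamReader(conn.getInputStream(), "UTF-8"))) {
            String line;
            while ((line = in.readLine()) != null) {
                System.out.println(line);
            }
        }
    }
}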

From: Xiaoyong Zhu <xiaoy...@microsoft.com>
Reply-To: user@hive.apache.org
Date: Wednesday, March 25, 2015 at 5:35 AM
To: user@hive.apache.org
Subject: Can WebHCat show non-MapReduce jobs?

It seems that WebHCat can only show MapReduce jobs. For example, if I submit
a Hive on Tez job via WebHCat, I can only get the TempletonControllerJob ID
(which is a MAPREDUCE job), but I cannot get the ID of the Tez job that the
TempletonControllerJob launches.

Is this by design? Is there a way to return all types of jobs via WebHCat?

Xiaoyong



How to read Protobuffers in Hive

2015-03-25 Thread Lukas Nalezenec

Hi,
I am trying to write a SerDe + ObjectInspectors for reading protobuffers in
Hive.
I tried to use the ProtocolBuffersStructObjectInspector class from Hive, but
it last worked with the old protobuffer version 2.3.
I tried to use the ObjectInspector from Twitter's Elephant Bird, but it does
not work either.


It looks like it would help if I hadn't used DynamicMessage$Builder. The
problem is that a DynamicMessage$Builder cannot be reused: once a message is
built from the builder, the builder's internal fields field is set to null,
and the second build() call throws an NPE.


...
at 
com.lukas.AbstractProtobufStructObjectInspector.setStructFieldData_default(AbstractProtobufStructObjectInspector.java:253)
at 
com.lukas.ProtobufStructObjectInspector.setStructFieldData(ProtobufStructObjectInspector.java:61)
at 
org.apache.hadoop.hive.serde2.objectinspector.ObjectInspectorConverters$StructConverter.convert(ObjectInspectorConverters.java:325)
at 
org.apache.hadoop.hive.serde2.objectinspector.ObjectInspectorConverters$StructConverter.convert(ObjectInspectorConverters.java:324)
at 
org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:630)
... 9 more
Caused by: java.lang.NullPointerException
at 
com.google.protobuf.DynamicMessage$Builder.clearField(DynamicMessage.java:386)
at 
com.google.protobuf.DynamicMessage$Builder.clearField(DynamicMessage.java:252)
at 
com.lukas.AbstractProtobufStructObjectInspector.setStructFieldData_default(AbstractProtobufStructObjectInspector.java:177)
... 13 more
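
For what it's worth, the direction I am exploring is to create a fresh
builder per record rather than reusing one. A sketch (hedged: this assumes
protobuf-java's DynamicMessage API; the class and method names are
illustrative, not from the actual SerDe):

import com.google.protobuf.Descriptors;
import com.google.protobuf.DynamicMessage;

public final class ProtobufRecords {
    private ProtobufRecords() {}

    // A new builder per call sidesteps the NPE that reuse triggers after
    // build() nulls the builder's internal field map.
    public static DynamicMessage parse(Descriptors.Descriptor descriptor,
                                       byte[] serialized) throws Exception {
        DynamicMessage.Builder builder = DynamicMessage.newBuilder(descriptor);
        builder.mergeFrom(serialized);
        return builder.build();
    }
}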



I am using Hive 0.10, but I am also interested in a solution for Hive 0.13.

Does anybody have a working protobuffer SerDe?

Thanks
Best Regards

Lukas


Re: How to read Protobuffers in Hive

2015-03-25 Thread Edward Capriolo
You may be able to use:
https://github.com/edwardcapriolo/hive-protobuf

(Use the branch not master)

This code is based on the Avro support. It works well even with nested
objects.




On Wed, Mar 25, 2015 at 12:28 PM, Lukas Nalezenec
<lukas.naleze...@firma.seznam.cz> wrote:

 Hi,
 I am trying to write a SerDe + ObjectInspectors for reading protobuffers
 in Hive. [snip]



Re: Intermittent BindException during long MR jobs

2015-03-25 Thread Krishna Rao
Thanks for the responses. In our case the port is 0, so according to the
link Ted mentioned (http://wiki.apache.org/hadoop/BindException) a collision
is highly unlikely:

"If the port is 0, then the OS is looking for any free port - so the
port-in-use and port-below-1024 problems are highly unlikely to be the cause
of the problem."

I think load may be the culprit, since the nodes are heavily used during the
times the exception occurs.

Is there any way to set or increase the timeout for the call/connection
attempt? In all cases so far it seems to happen on a call to delete a file
in HDFS. I searched through the HDFS code base but couldn't see an obvious
way to set a timeout, nor could I see one being set.
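
In the meantime, here is a sketch of what I plan to try on the client side
(hedged: I'm assuming the Hadoop 2.x ipc.client.* connect properties are
honoured on this code path; the path below is illustrative):

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class DeleteWithLongerConnectTimeout {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        // Per-attempt connect timeout (ms) and number of connect retries;
        // both can also be set cluster-wide in core-site.xml.
        conf.setInt("ipc.client.connect.timeout", 60000);
        conf.setInt("ipc.client.connect.max.retries", 20);
        try (FileSystem fs = FileSystem.get(conf)) {
            fs.delete(new Path("/tmp/example-file"), false);
        }
    }
}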

Krishna

On 28 February 2015 at 15:20, Ted Yu <yuzhih...@gmail.com> wrote:

 Krishna:
 Please take a look at:
 http://wiki.apache.org/hadoop/BindException

 Cheers

 On Thu, Feb 26, 2015 at 10:30 PM, <hadoop.supp...@visolve.com> wrote:

 Hello Krishna,



 The exception seems to be IP-specific. It might occur because no IP address
 is available on the system to assign. Double-check IP address availability
 and run the job.



 Thanks,
 S.RagavendraGanesh

 ViSolve Hadoop Support Team
 ViSolve Inc. | San Jose, California
 Website: www.visolve.com
 email: servi...@visolve.com | Phone: 408-850-2243





 From: Krishna Rao [mailto:krishnanj...@gmail.com]
 Sent: Thursday, February 26, 2015 9:48 PM
 To: user@hive.apache.org; u...@hadoop.apache.org
 Subject: Intermittent BindException during long MR jobs



 Hi,

 we occasionally run into a BindException that causes long-running jobs to
 fail.

 The stacktrace is below.

 Any ideas what this could be caused by?

 Cheers,

 Krishna

 Stacktrace:

 379969 [Thread-980] ERROR org.apache.hadoop.hive.ql.exec.Task  - Job
 Submission failed with exception 'java.net.BindException(Problem binding to
 [back10/10.4.2.10:0] java.net.BindException: Cannot assign requested
 address; For more details see:
 http://wiki.apache.org/hadoop/BindException)'
 java.net.BindException: Problem binding to [back10/10.4.2.10:0]
 java.net.BindException: Cannot assign requested address; For more details
 see: http://wiki.apache.org/hadoop/BindException
 at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:718)
 at org.apache.hadoop.ipc.Client.call(Client.java:1242)
 at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:202)
 at com.sun.proxy.$Proxy10.create(Unknown Source)
 at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.create(ClientNamenodeProtocolTranslatorPB.java:193)
 at sun.reflect.GeneratedMethodAccessor43.invoke(Unknown Source)
 at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
 at java.lang.reflect.Method.invoke(Method.java:597)
 at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:164)
 at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:83)
 at com.sun.proxy.$Proxy11.create(Unknown Source)
 at org.apache.hadoop.hdfs.DFSOutputStream.<init>(DFSOutputStream.java:1376)
 at org.apache.hadoop.hdfs.DFSOutputStream.newStreamForCreate(DFSOutputStream.java:1395)
 at org.apache.hadoop.hdfs.DFSClient.create(DFSClient.java:1255)
 at org.apache.hadoop.hdfs.DFSClient.create(DFSClient.java:1212)
 at org.apache.hadoop.hdfs.DistributedFileSystem.create(DistributedFileSystem.java:276)
 at org.apache.hadoop.hdfs.DistributedFileSystem.create(DistributedFileSystem.java:265)
 at org.apache.hadoop.hdfs.DistributedFileSystem.create(DistributedFileSystem.java:82)
 at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:888)
 at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:869)
 at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:768)
 at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:757)
 at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:558)
 at org.apache.hadoop.mapreduce.split.JobSplitWriter.createFile(JobSplitWriter.java:96)
 at org.apache.hadoop.mapreduce.split.JobSplitWriter.createSplitFiles(JobSplitWriter.java:85)
 at org.apache.hadoop.mapreduce.JobSubmitter.writeOldSplits(JobSubmitter.java:517)
 at org.apache.hadoop.mapreduce.JobSubmitter.writeSplits(JobSubmitter.java:487)
 at org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:369)
 at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1286)
 at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1283)
 at java.security.AccessController.doPrivileged(Native Method)
 at javax.security.auth.Subject.doAs(Subject.java:396)
 at ...

Re: Executing HQL files from JAVA application.

2015-03-25 Thread Steve Howard
I would argue that executing arbitrary code fetched from a remote server
just increases your security footprint, in terms of needing to control
another access point.

Purely out of curiosity, is there a compelling architectural reason or 
environment limitation that results in your need to do it this way?

Sent from my iPad

 On Mar 25, 2015, at 9:57 AM, Amal Gupta <amal.gup...@aexp.com> wrote:
 
 Hi Gopal,
 
 Thanks a lot. Connectivity to HiveServer2 was not an issue; the issue is
 executing a script placed at a specific location on the HiveServer2 box
 from the Java app.
 [snip]
 

Re:

2015-03-25 Thread Alan Gates

If you want off of the list send email to user-unsubscr...@hive.apache.org

Alan.


jake lawson <jacobj...@gmail.com>
March 25, 2015 at 15:45
Stop emailing me


[no subject]

2015-03-25 Thread jake lawson
Stop emailing me


Re:

2015-03-25 Thread Al Pivonka
Why
Who's emailing you
On Mar 25, 2015 6:59 PM, Alan Gates <alanfga...@gmail.com> wrote:

 If you want off of the list send email to user-unsubscr...@hive.apache.org

 Alan.

  jake lawson <jacobj...@gmail.com>
  March 25, 2015 at 15:45
 Stop emailing me




RE: Can WebHCat show non-MapReduce jobs?

2015-03-25 Thread Xiaoyong Zhu
Thanks for the reply, Eugene. However, when I list the jobs via WebHCat
(GET /templeton/v1/jobs), I get:

[{"id":"job_1427201295241_0001","detail":null},
 {"id":"job_1427201295241_0003","detail":null},
 {"id":"job_1427201295241_0005","detail":null}]

As you can see, there are three jobs: 0001, 0003 and 0005 (I submitted
three Hive on Tez jobs). In the YARN UI, however, I can see six jobs; 0002,
0004 and 0006, which do not appear in WebHCat, are shown there.

[screenshot: YARN UI listing the six applications]

Maybe something is wrong with my configuration?

Xiaoyong

From: Eugene Koifman [mailto:ekoif...@hortonworks.com]
Sent: Thursday, March 26, 2015 12:52 AM
To: user@hive.apache.org
Subject: Re: Can WebHCat show non-MapReduce jobs?

https://cwiki.apache.org/confluence/display/Hive/WebHCat+Reference+Jobs should 
produce all jobs (assuming the calling user has permissions to see them).
templeton.Server.showJobList() has detailed JavaDoc

From: Xiaoyong Zhu <xiaoy...@microsoft.com>
Reply-To: user@hive.apache.org
Date: Wednesday, March 25, 2015 at 5:35 AM
To: user@hive.apache.org
Subject: Can WebHCat show non-MapReduce jobs?

It seems that WebHCat can only show MapReduce jobs. For example, if I submit
a Hive on Tez job via WebHCat, I can only get the TempletonControllerJob ID
(which is a MAPREDUCE job), but I cannot get the ID of the Tez job that the
TempletonControllerJob launches.

Is this by design? Is there a way to return all types of jobs via WebHCat?

Xiaoyong