Looks like the Oozie we ship might not be able to launch Pig jobs that
use HCatLoader.
(Rajesh is ex-Yahoo; he now works for Intuit and is working on getting
HCat used there.)
-Thejas
-------- Original Message --------
Subject: Re: HCatlog Security Tokens with Oozie
Date: Tue, 10 Jul 2012 06:22:52 +0530
From: Rajesh Balamohan <[email protected]>
To: Thejas Nair <[email protected]>
Hi Thejas,
Thought of updating you on this.
I implemented the patch available in
https://issues.apache.org/jira/browse/OOZIE-889 and made the changes
for <credentials> in oozie workflow. Oozie 3.1.3 had to be installed.
I had to add the following in oozie-site.xml
<property>
<name>oozie.credentials.credentialclasses</name>
<value>hcat=org.apache.oozie.action.hadoop.HCatCredentials</value>
<description>
A list of credential class mapping for CredentialsProvider
</description>
</property>
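For reference, the <credentials> change in the workflow mentioned above might look roughly like the sketch below. This is an illustration, not the actual workflow from this thread: the workflow name, credential name, host names, ports, realm, and script name are all placeholders, and the schema version and the hcat.metastore.uri / hcat.metastore.principal property names are assumed from the Oozie credentials documentation, so check them against the Oozie build in use.

```xml
<!-- Sketch only: every name, host, and value here is a placeholder. -->
<workflow-app xmlns="uri:oozie:workflow:0.2.5" name="hcat-pig-wf">
  <credentials>
    <!-- type="hcat" refers to the key mapped in
         oozie.credentials.credentialclasses in oozie-site.xml -->
    <credential name="hcat-creds" type="hcat">
      <property>
        <name>hcat.metastore.uri</name>
        <value>thrift://metastore.example.com:9083</value>
      </property>
      <property>
        <name>hcat.metastore.principal</name>
        <value>hive/_HOST@EXAMPLE.COM</value>
      </property>
    </credential>
  </credentials>

  <start to="pig-node"/>

  <!-- cred="..." asks Oozie to obtain the metastore delegation token
       before launching this action -->
  <action name="pig-node" cred="hcat-creds">
    <pig>
      <job-tracker>jobtracker.example.com:8021</job-tracker>
      <name-node>hdfs://namenode.example.com:8020</name-node>
      <script>hcat_load.pig</script>
    </pig>
    <ok to="end"/>
    <error to="fail"/>
  </action>

  <kill name="fail">
    <message>Pig action failed</message>
  </kill>
  <end name="end"/>
</workflow-app>
```

The idea is that the Oozie server fetches the delegation token on the user's behalf and hands it to the launcher job, so HCatLoader does not need a Kerberos ticket inside the map task.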
Hcatalog works perfectly with oozie (3.1.3) now. :)
~Rajesh.B
On Sat, Jul 7, 2012 at 11:51 AM, Rajesh Balamohan <[email protected]> wrote:
Thanks for the quick reply Thejas,
I have the following properties set in core-site.xml and hive-site.xml:
<property>
<name>hadoop.proxyuser.oozie.groups</name>
<value>*</value>
</property>
<property>
<name>hadoop.proxyuser.oozie.hosts</name>
<value>*</value>
</property>
Apart from these, are there any other settings that need to be
configured? I run hcat_server.sh as the root user. hive-site.xml also
has hive-keytab, which loads the hive@_HOST/CORP.INTUIT.NET principal.
Oozie is able to run jobs in this secure cluster with normal Pig
scripts. However, the moment I add HCatLoader, it tries to get
HiveMetaStoreClient(), where it bombs with the GSS exception.
I have a hunch that the delegation tokens are either not created or
not passed correctly, and that this is why it fails under Oozie.
Debugging Thrift calls is turning out to be a challenge.
~Rajesh.B
On Sat, Jul 7, 2012 at 10:53 AM, Thejas Nair <[email protected]> wrote:
Hi Rajesh,
Have you configured HCat to let Oozie proxy as other users?
I have tested that it works through Templeton, which is similar
to working through Oozie.
You would need to follow steps like this:
http://people.apache.org/~thejas/templeton_doc_latest/installation.html#Secure+Cluster
i.e., add the hadoop.proxyuser.USER.groups and
hadoop.proxyuser.USER.hosts config params (replacing USER with
the user Oozie runs as) to hive-site.xml.
Thanks,
Thejas
On 7/6/12 9:08 PM, Rajesh Balamohan wrote:
Hi Thejas,
I have a security-enabled (Kerberos) Hadoop cluster, 0.20.205x, with
Pig 0.9.3 and HCatalog 0.4.1.
When I run HCatalog with Pig in standalone grunt, it works great.
However, when I embed the same Pig script in Oozie, it throws a GSS
transport exception like the one mentioned in HCATALOG-366.
Does HCatalog 0.4.1 work with Oozie in secured mode? Is there an
additional delegation token that is missing and causing this error?
It prints the GSS API error on the client as well as on the server
side.
Any pointers would be a great help.
2012-07-06 20:58:35,363 DEBUG org.apache.thrift.transport.TSaslTransport: opening transport org.apache.thrift.transport.TSaslClientTransport@2b3d9460
2012-07-06 20:58:35,364 DEBUG org.apache.thrift.transport.TSaslTransport: CLIENT: Writing message with status BAD and payload length 19
2012-07-06 20:58:35,364 WARN hive.metastore: Failed to connect to the MetaStore Server...
org.apache.thrift.transport.TTransportException: GSS initiate failed
    at org.apache.thrift.transport.TSaslTransport.sendAndThrowMessage(TSaslTransport.java:221)
    at org.apache.thrift.transport.TSaslTransport.open(TSaslTransport.java:296)
    at org.apache.thrift.transport.TSaslClientTransport.open(TSaslClientTransport.java:37)
    at org.apache.hadoop.hive.thrift.client.TUGIAssumingTransport$1.run(TUGIAssumingTransport.java:52)
    at org.apache.hadoop.hive.thrift.client.TUGIAssumingTransport$1.run(TUGIAssumingTransport.java:49)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:396)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1127)
    at org.apache.hadoop.hive.thrift.client.TUGIAssumingTransport.open(TUGIAssumingTransport.java:49)
    at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.openStore(HiveMetaStoreClient.java:263)
    at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.open(HiveMetaStoreClient.java:195)
    at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.<init>(HiveMetaStoreClient.java:157)
    at org.apache.oozie.action.hadoop.IntuitPigMain.runPigJob(IntuitPigMain.java:120)
    at org.apache.oozie.action.hadoop.PigMain.run(PigMain.java:206)
    at org.apache.oozie.action.hadoop.LauncherMain.run(LauncherMain.java:26)
    at org.apache.oozie.action.hadoop.IntuitPigMain.main(IntuitPigMain.java:61)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at org.apache.oozie.action.hadoop.LauncherMapper.map(LauncherMapper.java:391)
    at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:50)
    at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:391)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:325)
    at org.apache.hadoop.mapred.Child$4.run(Child.java:270)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:396)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1127)
    at org.apache.hadoop.mapred.Child.main(Child.java:264)
2012-07-06 20:58:35,364 INFO hive.metastore: Waiting 1 seconds before next connection attempt.
2012-07-06 20:58:36,365 DEBUG org.apache.thrift.transport.TSaslTransport: opening transport org.apache.thrift.transport.TSaslClientTransport@57d840cd
2012-07-06 20:58:36,366 DEBUG org.apache.thrift.transport.TSaslTransport: CLIENT: Writing message with status BAD and payload length 19
2012-07-06 20:58:36,366 WARN hive.metastore: Failed to connect to the MetaStore Server...
org.apache.thrift.transport.TTransportException: GSS initiate failed
    at org.apache.thrift.transport.TSaslTransport.sendAndThrowMessage(TSaslTransport.java:221)
    at org.apache.thrift.transport.TSaslTransport.open(TSaslTransport.java:296)
    at org.apache.thrift.transport.TSaslClientTransport.open(TSaslClientTransport.java:37)
    at org.apache.hadoop.hive.thrift.client.TUGIAssumingTransport$1.run(TUGIAssumingTransport.java:52)
    at org.apache.hadoop.hive.thrift.client.TUGIAssumingTransport$1.run(TUGIAssumingTransport.java:49)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:396)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1127)
    at org.apache.hadoop.hive.thrift.client.TUGIAssumingTransport.open(TUGIAssumingTransport.java:49)
    at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.openStore(HiveMetaStoreClient.java:263)
    at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.open(HiveMetaStoreClient.java:195)
    at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.<init>(HiveMetaStoreClient.java:157)
    at org.apache.oozie.action.hadoop.IntuitPigMain.runPigJob(IntuitPigMain.java:120)
    at org.apache.oozie.action.hadoop.PigMain.run(PigMain.java:206)
    at org.apache.oozie.action.hadoop.LauncherMain.run(LauncherMain.java:26)
    at org.apache.oozie.action.hadoop.IntuitPigMain.main(IntuitPigMain.java:61)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at org.apache.oozie.action.hadoop.LauncherMapper.map(LauncherMapper.java:391)
    at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:50)
    at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:391)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:325)
    at org.apache.hadoop.mapred.Child$4.run(Child.java:270)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:396)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1127)
    at org.apache.hadoop.mapred.Child.main(Child.java:264)
--
~Rajesh.B