On Friday, March 25, 2016 1:11 AM, Ping PW Wang <[email protected]> wrote:
Hi Jordan,
Thanks a lot for your comment. My Hive configuration is fine; it works well with Kerberos. I found that the Parquet support added via SQOOP-1390 uses the Kite SDK. According to the stack trace, kitesdk.data.spi.hive.MetaStoreUtil calls HiveMetaStoreClient. There is a Kite bug, KITE-1014 ("Fix support for Hive datasets on Kerberos enabled clusters"), which was fixed in version 1.1.0. Sqoop 1.4.6 uses Kite 1.0.0, which lacks this fix. I made the changes below:
1) Upgraded Sqoop's Kite dependency to the latest version
2) On the Sqoop side, added the Hive configuration and passed it to Kite (a sketch of this follows below)
After these two fixes, the error is gone. But a new problem occurred: Kerberos support for this usage still seems to have problems. I opened a Sqoop JIRA:
https://issues.apache.org/jira/browse/SQOOP-2894.
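For anyone hitting the same issue, here is a minimal sketch of what fix 2) amounts to. The class and method names are illustrative, not the actual SQOOP-2894 patch, and /etc/hive/conf is only an assumed location for hive-site.xml:

// Illustrative sketch of fix 2); not the actual SQOOP-2894 patch.
// Idea: merge hive-site.xml into the job Configuration before Kite's
// MetaStoreUtil constructs a HiveMetaStoreClient, so the client sees
// hive.metastore.uris and the Kerberos-related settings.
import java.io.File;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;

public final class HiveConfForKite {

  private HiveConfForKite() {}

  // conf: the Hadoop job configuration later handed to Kite.
  // hiveConfDir: directory containing hive-site.xml, e.g. /etc/hive/conf
  // (an assumption; use your cluster's actual Hive config directory).
  public static void mergeHiveSite(Configuration conf, String hiveConfDir) {
    File hiveSite = new File(hiveConfDir, "hive-site.xml");
    if (hiveSite.exists()) {
      // addResource overlays the Hive properties onto the job config,
      // including hive.metastore.uris and, on a secured cluster,
      // hive.metastore.sasl.enabled and hive.metastore.kerberos.principal.
      conf.addResource(new Path(hiveSite.toURI()));
    }
  }
}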
From: Jordan Birdsell <[email protected]>
To: "[email protected]" <[email protected]>, "[email protected]"
<[email protected]>
Date: 03/22/2016 08:07 PM
Subject: RE: Sqoop hive import with "as-parquetfile" failed in Kerberos enabled cluster
I don’t believe this is a Kerberos issue. Is hive.metastore.uris configured properly in your hive-site.xml?
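For reference, on a Kerberos-enabled cluster the metastore settings in hive-site.xml usually look like the snippet below. The hostname and principal are placeholders, not values from this cluster:

<!-- typical metastore settings on a Kerberized cluster (placeholder values) -->
<property>
  <name>hive.metastore.uris</name>
  <value>thrift://metastore-host.example.com:9083</value>
</property>
<property>
  <name>hive.metastore.sasl.enabled</name>
  <value>true</value>
</property>
<property>
  <name>hive.metastore.kerberos.principal</name>
  <value>hive/[email protected]</value>
</property>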
From: Ping PW Wang [mailto:[email protected]]
Sent: Tuesday, March 22, 2016 6:12 AM
To: [email protected]; [email protected]
Subject: Sqoop hive import with "as-parquetfile" failed in Kerberos enabled cluster
Hi,
I'm trying to import data from an external database into Hive with the Parquet option, but it fails in a Kerberos-enabled environment. It succeeds without Kerberos. Since Parquet support is newly added in 1.4.6, is there any limitation on using it with security enabled? Please advise, thanks a lot!
The sqoop command I used:
sqoop import --connect jdbc:db2://xxx:50000/testdb --username xxx --password xxx --table users --hive-import --hive-table users3 --as-parquetfile -m 1
......
16/03/20 23:09:08 DEBUG db.DataDrivenDBInputFormat: Creating input split with lower bound '1=1' and upper bound '1=1'
16/03/20 23:09:08 INFO mapreduce.JobSubmitter: number of splits:1
16/03/20 23:09:09 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1458536022132_0002
16/03/20 23:09:09 INFO mapreduce.JobSubmitter: Kind: HDFS_DELEGATION_TOKEN, Service: 9.30.151.107:8020, Ident: (HDFS_DELEGATION_TOKEN token 16 for ambari-qa)
16/03/20 23:09:10 INFO impl.YarnClientImpl: Submitted application application_1458536022132_0002
16/03/20 23:09:10 INFO mapreduce.Job: The url to track the job: http://xxx:8088/proxy/application_1458536022132_0002/
16/03/20 23:09:10 INFO mapreduce.Job: Running job: job_1458536022132_0002
16/03/20 23:33:42 INFO mapreduce.Job: Job job_1458536022132_0002 running in uber mode : false
16/03/20 23:33:42 INFO mapreduce.Job: map 0% reduce 0%
16/03/20 23:33:42 INFO mapreduce.Job: Job job_1458536022132_0002 failed with state FAILED due to: Application application_1458536022132_0002 failed 2 times due to ApplicationMaster for attempt appattempt_1458536022132_0002_000002 timed out. Failing the application.
16/03/20 23:33:42 INFO mapreduce.Job: Counters: 0
16/03/20 23:33:42 WARN mapreduce.Counters: Group FileSystemCounters is deprecated. Use org.apache.hadoop.mapreduce.FileSystemCounter instead
16/03/20 23:33:42 INFO mapreduce.ImportJobBase: Transferred 0 bytes in 1,496.0323 seconds (0 bytes/sec)
16/03/20 23:33:42 WARN mapreduce.Counters: Group org.apache.hadoop.mapred.Task$Counter is deprecated. Use org.apache.hadoop.mapreduce.TaskCounter instead
16/03/20 23:33:42 INFO mapreduce.ImportJobBase: Retrieved 0 records.
16/03/20 23:33:42 DEBUG util.ClassLoaderStack: Restoring classloader: sun.misc.Launcher$AppClassLoader@67205a84
16/03/20 23:33:42 ERROR tool.ImportTool: Error during import: Import job failed!
Here's the Job Log:
......
2016-02-26 04:20:07,020 INFO [main] org.apache.hadoop.mapreduce.v2.app.MRAppMaster: Using mapred newApiCommitter.
2016-02-26 04:20:08,088 INFO [main] org.apache.hadoop.mapreduce.v2.app.MRAppMaster: OutputCommitter set in config null
2016-02-26 04:20:08,918 INFO [main] hive.metastore: Trying to connect to metastore with URI thrift://xxx:9083
2016-02-26 04:30:09,207 WARN [main] hive.metastore: set_ugi() not successful, Likely cause: new client talking to old server. Continuing without it.
org.apache.thrift.transport.TTransportException: java.net.SocketTimeoutException: Read timed out
    at org.apache.thrift.transport.TIOStreamTransport.read(TIOStreamTransport.java:129)
    at org.apache.thrift.transport.TTransport.readAll(TTransport.java:86)
    at org.apache.thrift.protocol.TBinaryProtocol.readStringBody(TBinaryProtocol.java:380)
    at org.apache.thrift.protocol.TBinaryProtocol.readMessageBegin(TBinaryProtocol.java:230)
    at org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:69)
    at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.recv_set_ugi(ThriftHiveMetastore.java:3688)
    at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.set_ugi(ThriftHiveMetastore.java:3674)
    at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.open(HiveMetaStoreClient.java:448)
    at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.<init>(HiveMetaStoreClient.java:237)
    at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.<init>(HiveMetaStoreClient.java:182)
    at org.kitesdk.data.spi.hive.MetaStoreUtil.<init>(MetaStoreUtil.java:82)
    at org.kitesdk.data.spi.hive.HiveAbstractMetadataProvider.getMetaStoreUtil(HiveAbstractMetadataProvider.java:63)
    at org.kitesdk.data.spi.hive.HiveAbstractMetadataProvider.resolveNamespace(HiveAbstractMetadataProvider.java:270)
    at org.kitesdk.data.spi.hive.HiveAbstractMetadataProvider.resolveNamespace(HiveAbstractMetadataProvider.java:255)
    at org.kitesdk.data.spi.hive.HiveAbstractMetadataProvider.load(HiveAbstractMetadataProvider.java:102)
    at org.kitesdk.data.spi.filesystem.FileSystemDatasetRepository.load(FileSystemDatasetRepository.java:192)
    at org.kitesdk.data.Datasets.load(Datasets.java:108)
    at org.kitesdk.data.Datasets.load(Datasets.java:165)
    at org.kitesdk.data.mapreduce.DatasetKeyOutputFormat.load(DatasetKeyOutputFormat.java:510)
    at org.kitesdk.data.mapreduce.DatasetKeyOutputFormat.getOutputCommitter(DatasetKeyOutputFormat.java:473)
    at org.apache.hadoop.mapreduce.v2.app.MRAppMaster$1.call(MRAppMaster.java:476)
    at org.apache.hadoop.mapreduce.v2.app.MRAppMaster$1.call(MRAppMaster.java:458)
    at org.apache.hadoop.mapreduce.v2.app.MRAppMaster.callWithJobClassLoader(MRAppMaster.java:1560)
    at org.apache.hadoop.mapreduce.v2.app.MRAppMaster.createOutputCommitter(MRAppMaster.java:458)
    at org.apache.hadoop.mapreduce.v2.app.MRAppMaster.serviceInit(MRAppMaster.java:377)
    at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
    at org.apache.hadoop.mapreduce.v2.app.MRAppMaster$4.run(MRAppMaster.java:1518)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:422)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
    at org.apache.hadoop.mapreduce.v2.app.MRAppMaster.initAndStartAppMaster(MRAppMaster.java:1515)
    at org.apache.hadoop.mapreduce.v2.app.MRAppMaster.main(MRAppMaster.java:1448)
.......