On Friday, March 25, 2016 1:11 AM, Ping PW Wang <[email protected]> wrote:
Hi Jordan,
Thanks a lot for your comment. My Hive configuration is fine; it works well with Kerberos. I found that the Parquet support added via SQOOP-1390 uses the Kite SDK. According to the stack trace, kitesdk.data.spi.hive.MetaStoreUtil calls HiveMetaStoreClient. There is a Kite bug, KITE-1014 ("Fix support for Hive datasets on Kerberos enabled clusters"), which was fixed in version 1.1.0. Sqoop 1.4.6 uses Kite 1.0.0, which lacks this fix. I made the changes below:
1) Upgraded Sqoop's Kite dependency to the latest version
2) On the Sqoop side, added the Hive configuration and passed it to Kite (a sketch of this follows below)
After these two fixes, the error is gone. But a new problem occurred: Kerberos support for this usage still seems to have problems. I opened a Sqoop JIRA:
https://issues.apache.org/jira/browse/SQOOP-2894.
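For anyone hitting the same issue, here is a minimal sketch of what fix 2) amounts to. The class and method names are illustrative, not the actual SQOOP-2894 patch, and /etc/hive/conf is only an assumed location for hive-site.xml:

// Illustrative sketch of fix 2); not the actual SQOOP-2894 patch.
// Idea: merge hive-site.xml into the job Configuration before Kite's
// MetaStoreUtil constructs a HiveMetaStoreClient, so the client sees
// hive.metastore.uris and the Kerberos-related settings.
import java.io.File;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;

public final class HiveConfForKite {

  private HiveConfForKite() {}

  // conf: the Hadoop job configuration later handed to Kite.
  // hiveConfDir: directory containing hive-site.xml, e.g. /etc/hive/conf
  // (an assumption; use your cluster's actual Hive config directory).
  public static void mergeHiveSite(Configuration conf, String hiveConfDir) {
    File hiveSite = new File(hiveConfDir, "hive-site.xml");
    if (hiveSite.exists()) {
      // addResource overlays the Hive properties onto the job config,
      // including hive.metastore.uris and, on a secured cluster,
      // hive.metastore.sasl.enabled and hive.metastore.kerberos.principal.
      conf.addResource(new Path(hiveSite.toURI()));
    }
  }
}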
From: Jordan Birdsell <[email protected]>
To: "[email protected]" <[email protected]>, "[email protected]"
<[email protected]>
Date: 03/22/2016 08:07 PM
Subject: RE: Sqoop hive import with "as-parquetfile" failed in Kerberos enabled cluster
I don’t believe this is a Kerberos issue. Is hive.metastore.uris configured properly in your hive-site.xml?
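For reference, on a Kerberos-enabled cluster the metastore settings in hive-site.xml usually look like the snippet below. The hostname and principal are placeholders, not values from this cluster:

<!-- typical metastore settings on a Kerberized cluster (placeholder values) -->
<property>
  <name>hive.metastore.uris</name>
  <value>thrift://metastore-host.example.com:9083</value>
</property>
<property>
  <name>hive.metastore.sasl.enabled</name>
  <value>true</value>
</property>
<property>
  <name>hive.metastore.kerberos.principal</name>
  <value>hive/[email protected]</value>
</property>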
From: Ping PW Wang [mailto:[email protected]]
Sent: Tuesday, March 22, 2016 6:12 AM
To: [email protected]; [email protected]
Subject: Sqoop hive import with "as-parquetfile" failed in Kerberos enabled cluster
Hi,
I'm trying to import data from an external database into Hive with the Parquet option, but it fails in a Kerberos-enabled environment. It succeeds without Kerberos. Since Parquet support is newly added in 1.4.6, is there any limitation on using it with security enabled? Please advise, thanks a lot!
The sqoop command I used:
sqoop import --connect jdbc:db2://xxx:50000/testdb --username xxx --password xxx --table users --hive-import --hive-table users3 --as-parquetfile -m 1
......
16/03/20 23:09:08 DEBUG db.DataDrivenDBInputFormat: Creating input split with lower bound '1=1' and upper bound '1=1'
16/03/20 23:09:08 INFO mapreduce.JobSubmitter: number of splits:1
16/03/20 23:09:09 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1458536022132_0002
16/03/20 23:09:09 INFO mapreduce.JobSubmitter: Kind: HDFS_DELEGATION_TOKEN, Service: 9.30.151.107:8020, Ident: (HDFS_DELEGATION_TOKEN token 16 for ambari-qa)
16/03/20 23:09:10 INFO impl.YarnClientImpl: Submitted application application_1458536022132_0002
16/03/20 23:09:10 INFO mapreduce.Job: The url to track the job: http://xxx:8088/proxy/application_1458536022132_0002/
16/03/20 23:09:10 INFO mapreduce.Job: Running job: job_1458536022132_0002
16/03/20 23:33:42 INFO mapreduce.Job: Job job_1458536022132_0002 running in uber mode : false
16/03/20 23:33:42 INFO mapreduce.Job: map 0% reduce 0%
16/03/20 23:33:42 INFO mapreduce.Job: Job job_1458536022132_0002 failed with state FAILED due to: Application application_1458536022132_0002 failed 2 times due to ApplicationMaster for attempt appattempt_1458536022132_0002_000002 timed out. Failing the application.
16/03/20 23:33:42 INFO mapreduce.Job: Counters: 0
16/03/20 23:33:42 WARN mapreduce.Counters: Group FileSystemCounters is deprecated. Use org.apache.hadoop.mapreduce.FileSystemCounter instead
16/03/20 23:33:42 INFO mapreduce.ImportJobBase: Transferred 0 bytes in 1,496.0323 seconds (0 bytes/sec)
16/03/20 23:33:42 WARN mapreduce.Counters: Group org.apache.hadoop.mapred.Task$Counter is deprecated. Use org.apache.hadoop.mapreduce.TaskCounter instead
16/03/20 23:33:42 INFO mapreduce.ImportJobBase: Retrieved 0 records.
16/03/20 23:33:42 DEBUG util.ClassLoaderStack: Restoring classloader: sun.misc.Launcher$AppClassLoader@67205a84
16/03/20 23:33:42 ERROR tool.ImportTool: Error during import: Import job failed!
Here's the Job Log:
......
2016-02-26 04:20:07,020 INFO [main] org.apache.hadoop.mapreduce.v2.app.MRAppMaster: Using mapred newApiCommitter.
2016-02-26 04:20:08,088 INFO [main] org.apache.hadoop.mapreduce.v2.app.MRAppMaster: OutputCommitter set in config null
2016-02-26 04:20:08,918 INFO [main] hive.metastore: Trying to connect to metastore with URI thrift://xxx:9083
2016-02-26 04:30:09,207 WARN [main] hive.metastore: set_ugi() not successful, Likely cause: new client talking to old server. Continuing without it.
org.apache.thrift.transport.TTransportException: java.net.SocketTimeoutException: Read timed out
    at org.apache.thrift.transport.TIOStreamTransport.read(TIOStreamTransport.java:129)
    at org.apache.thrift.transport.TTransport.readAll(TTransport.java:86)
    at org.apache.thrift.protocol.TBinaryProtocol.readStringBody(TBinaryProtocol.java:380)
    at org.apache.thrift.protocol.TBinaryProtocol.readMessageBegin(TBinaryProtocol.java:230)
    at org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:69)
    at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.recv_set_ugi(ThriftHiveMetastore.java:3688)
    at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.set_ugi(ThriftHiveMetastore.java:3674)
    at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.open(HiveMetaStoreClient.java:448)
    at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.<init>(HiveMetaStoreClient.java:237)
    at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.<init>(HiveMetaStoreClient.java:182)
    at org.kitesdk.data.spi.hive.MetaStoreUtil.<init>(MetaStoreUtil.java:82)
    at org.kitesdk.data.spi.hive.HiveAbstractMetadataProvider.getMetaStoreUtil(HiveAbstractMetadataProvider.java:63)
    at org.kitesdk.data.spi.hive.HiveAbstractMetadataProvider.resolveNamespace(HiveAbstractMetadataProvider.java:270)
    at org.kitesdk.data.spi.hive.HiveAbstractMetadataProvider.resolveNamespace(HiveAbstractMetadataProvider.java:255)
    at org.kitesdk.data.spi.hive.HiveAbstractMetadataProvider.load(HiveAbstractMetadataProvider.java:102)
    at org.kitesdk.data.spi.filesystem.FileSystemDatasetRepository.load(FileSystemDatasetRepository.java:192)
    at org.kitesdk.data.Datasets.load(Datasets.java:108)
    at org.kitesdk.data.Datasets.load(Datasets.java:165)
    at org.kitesdk.data.mapreduce.DatasetKeyOutputFormat.load(DatasetKeyOutputFormat.java:510)
    at org.kitesdk.data.mapreduce.DatasetKeyOutputFormat.getOutputCommitter(DatasetKeyOutputFormat.java:473)
    at org.apache.hadoop.mapreduce.v2.app.MRAppMaster$1.call(MRAppMaster.java:476)
    at org.apache.hadoop.mapreduce.v2.app.MRAppMaster$1.call(MRAppMaster.java:458)
    at org.apache.hadoop.mapreduce.v2.app.MRAppMaster.callWithJobClassLoader(MRAppMaster.java:1560)
    at org.apache.hadoop.mapreduce.v2.app.MRAppMaster.createOutputCommitter(MRAppMaster.java:458)
    at org.apache.hadoop.mapreduce.v2.app.MRAppMaster.serviceInit(MRAppMaster.java:377)
    at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
    at org.apache.hadoop.mapreduce.v2.app.MRAppMaster$4.run(MRAppMaster.java:1518)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:422)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
    at org.apache.hadoop.mapreduce.v2.app.MRAppMaster.initAndStartAppMaster(MRAppMaster.java:1515)
    at org.apache.hadoop.mapreduce.v2.app.MRAppMaster.main(MRAppMaster.java:1448)
.......