[ https://issues.apache.org/jira/browse/HCATALOG-541?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14285924#comment-14285924 ]
Manish Malhotra commented on HCATALOG-541:
------------------------------------------
Hi Travis and Arup,
I'm also facing a similar problem while using the Hive Thrift server, but
without HCatalog.
However, I didn't see an OOM error in the Thrift server logs.
The pattern mostly appears when the load on the Hive Thrift server is high
(typically when most of the Hive ETL jobs are running): at times the server gets
into a state where it doesn't respond in time, and the client throws a socket
timeout.
This happens for different operations, not only for list partitions.
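One client-side mitigation that might be worth trying is to retry the call with backoff when the read times out. The sketch below is only an illustration under assumptions: `RetryOnTimeout` and `callWithRetry` are hypothetical names, not part of Hive or Thrift, and the attempt count and backoff values are arbitrary. It does not address the server-side cause, and since the trace goes through RetryingMetaStoreClient, the client may already retry some failures on its own.

```java
import java.net.SocketTimeoutException;
import java.util.concurrent.Callable;

// Hypothetical helper (not a Hive or Thrift API): retries a call only when
// the failure was caused by a socket read timeout.
public class RetryOnTimeout {

    // Runs `call` up to maxAttempts times, doubling the sleep between attempts.
    // Any failure that is not a read timeout is rethrown immediately.
    public static <T> T callWithRetry(Callable<T> call, int maxAttempts, long initialBackoffMs)
            throws Exception {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        long backoffMs = initialBackoffMs;
        for (int attempt = 1; ; attempt++) {
            try {
                return call.call();
            } catch (Exception e) {
                if (!isReadTimeout(e) || attempt >= maxAttempts) {
                    throw e;
                }
                Thread.sleep(backoffMs);
                backoffMs *= 2; // exponential backoff
            }
        }
    }

    // Walks the cause chain, because the timeout is usually wrapped
    // (for example in a TTransportException, as in the trace).
    static boolean isReadTimeout(Throwable t) {
        for (Throwable c = t; c != null; c = c.getCause()) {
            if (c instanceof SocketTimeoutException) {
                return true;
            }
        }
        return false;
    }
}
```

For example, a metastore call could be wrapped as `RetryOnTimeout.callWithRetry(() -> client.getDatabase(name), 3, 1000L)`, where `client` and `name` stand for whatever the application already holds. The lambda form needs Java 8 or later; on older JVMs an anonymous `Callable` would do the same job.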
Please post here if there is any update on this ticket; it might help my
situation as well.
Regards,
Manish
Stack Trace:
at org.apache.thrift.protocol.TBinaryProtocol.readAll(TBinaryProtocol.java:378)
at org.apache.thrift.protocol.TBinaryProtocol.readI32(TBinaryProtocol.java:297)
at org.apache.thrift.protocol.TBinaryProtocol.readMessageBegin(TBinaryProtocol.java:204)
at org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:69)
at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.recv_get_database(ThriftHiveMetastore.java:412)
at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.get_database(ThriftHiveMetastore.java:399)
at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.getDatabase(HiveMetaStoreClient.java:736)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:601)
at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.invoke(RetryingMetaStoreClient.java:74)
at $Proxy7.getDatabase(Unknown Source)
at org.apache.hadoop.hive.ql.metadata.Hive.getDatabase(Hive.java:1110)
at org.apache.hadoop.hive.ql.metadata.Hive.databaseExists(Hive.java:1099)
at org.apache.hadoop.hive.ql.exec.DDLTask.showTables(DDLTask.java:2206)
at org.apache.hadoop.hive.ql.exec.DDLTask.execute(DDLTask.java:334)
at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:138)
at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:57)
at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1336)
at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1122)
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:935)
at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:259)
at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:216)
at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:412)
at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:347)
at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:706)
at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:613)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:601)
at org.apache.hadoop.util.RunJar.main(RunJar.java:160)
Caused by: java.net.SocketTimeoutException: Read timed out
at java.net.SocketInputStream.socketRead0(Native Method)
at java.net.SocketInputStream.read(SocketInputStream.java:150)
at java.net.SocketInputStream.read(SocketInputStream.java:121)
at java.io.BufferedInputStream.fill(BufferedInputStream.java:235)
at java.io.BufferedInputStream.read1(BufferedInputStream.java:275)
at java.io.BufferedInputStream.read(BufferedInputStream.java:334)
at org.apache.thrift.transport.TIOStreamTransport.read(TIOStreamTransport.java:127)
... 34 more
2015-01-20 22:44:12,978 ERROR exec.Task (SessionState.java:printError(401)) - FAILED: Error in metadata: org.apache.thrift.transport.TTransportException: java.net.SocketTimeoutException: Read timed out
org.apache.hadoop.hive.ql.metadata.HiveException: org.apache.thrift.transport.TTransportException: java.net.SocketTimeoutException: Read timed out
at org.apache.hadoop.hive.ql.metadata.Hive.getDatabase(Hive.java:1114)
at org.apache.hadoop.hive.ql.metadata.Hive.databaseExists(Hive.java:1099)
at org.apache.hadoop.hive.ql.exec.DDLTask.showTables(DDLTask.java:2206)
at org.apache.hadoop.hive.ql.exec.DDLTask.execute(DDLTask.java:334)
at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:138)
at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:57)
at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1336)
at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1122)
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:935)
at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:259)
at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:216)
> The meta store client throws TimeOut exception if ~1000 clients are trying to call listPartition on the server
> --------------------------------------------------------------------------------------------------------------
>
> Key: HCATALOG-541
> URL: https://issues.apache.org/jira/browse/HCATALOG-541
> Project: HCatalog
> Issue Type: Improvement
> Environment: Hadoop 0.23.4
> Hcatalog 0.4
> Oracle
> Reporter: Arup Malakar
>
> Error on the client:
> {code}
> 2012-10-24 21:44:03,942 INFO [pool-12-thread-2] org.apache.hcatalog.hcatmix.load.tasks.Task: Error listing partitions
> org.apache.thrift.transport.TTransportException: java.net.SocketTimeoutException: Read timed out
> at org.apache.thrift.transport.TIOStreamTransport.read(TIOStreamTransport.java:129)
> at org.apache.thrift.transport.TTransport.readAll(TTransport.java:84)
> at org.apache.thrift.transport.TSaslTransport.readLength(TSaslTransport.java:345)
> at org.apache.thrift.transport.TSaslTransport.readFrame(TSaslTransport.java:422)
> at org.apache.thrift.transport.TSaslTransport.read(TSaslTransport.java:404)
> at org.apache.thrift.transport.TSaslClientTransport.read(TSaslClientTransport.java:37)
> at org.apache.thrift.transport.TTransport.readAll(TTransport.java:84)
> at org.apache.hadoop.hive.thrift.TFilterTransport.readAll(TFilterTransport.java:62)
> at org.apache.thrift.protocol.TBinaryProtocol.readAll(TBinaryProtocol.java:378)
> at org.apache.thrift.protocol.TBinaryProtocol.readI32(TBinaryProtocol.java:297)
> at org.apache.thrift.protocol.TBinaryProtocol.readMessageBegin(TBinaryProtocol.java:204)
> at org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:69)
> at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.recv_get_partitions(ThriftHiveMetastore.java:1208)
> at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.get_partitions(ThriftHiveMetastore.java:1193)
> at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.listPartitions(HiveMetaStoreClient.java:631)
> at org.apache.hcatalog.hcatmix.load.tasks.HCatListPartitionTask.doTask(HCatListPartitionTask.java:45)
> at org.apache.hcatalog.hcatmix.load.TaskExecutor.call(TaskExecutor.java:79)
> at org.apache.hcatalog.hcatmix.load.TaskExecutor.call(TaskExecutor.java:39)
> at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
> at java.util.concurrent.FutureTask.run(FutureTask.java:138)
> at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
> at java.lang.Thread.run(Thread.java:619)
> Caused by: java.net.SocketTimeoutException: Read timed out
> at java.net.SocketInputStream.socketRead0(Native Method)
> at java.net.SocketInputStream.read(SocketInputStream.java:129)
> at org.apache.thrift.transport.TIOStreamTransport.read(TIOStreamTransport.java:127)
> {code}
> Error on the server:
> {code}
> Exception in thread "pool-1-thread-3206" java.lang.OutOfMemoryError: unable to create new native thread
> at java.lang.Thread.start0(Native Method)
> at java.lang.Thread.start(Thread.java:597)
> at org.datanucleus.store.query.Query.performExecuteTask(Query.java:1891)
> at org.datanucleus.store.rdbms.query.JDOQLQuery.performExecute(JDOQLQuery.java:613)
> at org.datanucleus.store.query.Query.executeQuery(Query.java:1692)
> at org.datanucleus.store.query.Query.executeWithArray(Query.java:1527)
> at org.datanucleus.jdo.JDOQuery.execute(JDOQuery.java:266)
> at org.apache.hadoop.hive.metastore.ObjectStore.listMPartitions(ObjectStore.java:1521)
> at org.apache.hadoop.hive.metastore.ObjectStore.getPartitions(ObjectStore.java:1268)
> at sun.reflect.GeneratedMethodAccessor12.invoke(Unknown Source)
> at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
> at java.lang.reflect.Method.invoke(Method.java:597)
> at org.apache.hadoop.hive.metastore.RetryingRawStore.invoke(RetryingRawStore.java:111)
> at $Proxy7.getPartitions(Unknown Source)
> at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.get_partitions(HiveMetaStore.java:1468)
> at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Processor$get_partitions.getResult(ThriftHiveMetastore.java:5318)
> at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Processor$get_partitions.getResult(ThriftHiveMetastore.java:5306)
> at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:32)
> at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:34)
> at org.apache.hadoop.hive.thrift.HadoopThriftAuthBridge20S$Server$TUGIAssumingProcessor$1.run(HadoopThriftAuthBridge20S.java:555)
> at org.apache.hadoop.hive.thrift.HadoopThriftAuthBridge20S$Server$TUGIAssumingProcessor$1.run(HadoopThriftAuthBridge20S.java:552)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:396)
> at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1212)
> at org.apache.hadoop.hive.thrift.HadoopThriftAuthBridge20S$Server$TUGIAssumingProcessor.process(HadoopThriftAuthBridge20S.java:552)
> at org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run_aroundBody0(TThreadPoolServer.java:176)
> at org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run_aroundBody1$advice(TThreadPoolServer.java:101)
> at org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:1)
> at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
> at java.lang.Thread.run(Thread.java:619)
> {code}
> The graph for concurrent usage of list partition can be seen here:
> https://cwiki.apache.org/confluence/download/attachments/30740331/hcatmix_list_partition_loadtest_25min.html
> The table has 2000 partitions.