[
https://issues.apache.org/jira/browse/HIVE-9469?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14315761#comment-14315761
]
Manish Malhotra commented on HIVE-9469:
---------------------------------------
Thanks [~vgumashta].
But did the old version have this issue? I didn't see any issue like this, apart
from https://issues.apache.org/jira/browse/HCATALOG-541, where there were OOM errors.
We are in the process of upgrading Hive to 0.12.
Meanwhile, the steps I have taken to improve performance and avoid this
problem are:
1. Database connection pool tuning:
the default is 10; I raised it to 30 on each Thrift server.
The DBCP connection pool (maximum connections) setting also needs to
be thought through, though, as it affects how much of MySQL's resources are used.
2. JVM GC tuning
3. Keeping the number of partitions in check
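To make the MySQL implication in step 1 concrete, here is a minimal sketch of the arithmetic. The 2 servers and the pool size of 30 come from this setup; the class name PoolBudget, the reserved headroom of 20 connections, and the use of MySQL's default max_connections of 151 are assumptions for illustration only, not values from our deployment.

```java
// Hypothetical helper: checks whether the metastore connection pools
// fit under MySQL's connection limit. Not part of Hive or DataNucleus.
class PoolBudget {

    // Worst-case number of connections all Thrift servers can hold open.
    static int totalConnections(int thriftServers, int poolSizePerServer) {
        return thriftServers * poolSizePerServer;
    }

    // True when the pools fit under max_connections after setting aside
    // some connections for replication, monitoring, and admin sessions.
    static boolean fits(int thriftServers, int poolSizePerServer,
                        int mysqlMaxConnections, int reserved) {
        return totalConnections(thriftServers, poolSizePerServer)
                <= mysqlMaxConnections - reserved;
    }

    public static void main(String[] args) {
        int servers = 2;          // from the issue environment
        int poolSize = 30;        // the tuned value from step 1
        int maxConnections = 151; // assumed: MySQL default max_connections
        int reserved = 20;        // assumed headroom, purely illustrative

        System.out.println("total=" + totalConnections(servers, poolSize)
                + " fits=" + fits(servers, poolSize, maxConnections, reserved));
    }
}
```

If more Thrift servers are added behind the load balancer, the same check shows when the per-server pool size or MySQL's limit has to change.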
Do you have any other suggestions for production deployment?
I also have another question: the Thrift server uses the DataNucleus framework,
which is an open-source persistence product and uses JDO internally.
DataNucleus doesn't support all of the DBCP connection pooling configs.
So, should the Thrift server either use another ORM tool or provide more hooks
for DBCP support?
Regards,
Manish
> Hive Thrift Server throws Socket Timeout Exception: Read time out
> -----------------------------------------------------------------
>
> Key: HIVE-9469
> URL: https://issues.apache.org/jira/browse/HIVE-9469
> Project: Hive
> Issue Type: Bug
> Components: Metastore
> Affects Versions: 0.10.0
> Environment: 4 core cpu, 15gb memory. 2 thrift server behind load
> balancer
> Reporter: Manish Malhotra
> Attachments: After_JMV_Profiling_Tuning.jpg,
> Before_JMV_Profiling_Tuning.jpg
>
>
> Hi All,
> Please review the following problem. I also posted it to the hive-user
> group, but haven't received any response yet.
> This is happening quite frequently in our environment,
> so it would be great if somebody could take a look and advise.
> I'm using the Hive Thrift Server in production, where it handles around 500
> req/min at peak.
> After a certain point the Hive Thrift Server stops responding
> and throws the following exception:
> "org.apache.hadoop.hive.ql.metadata.HiveException:
> org.apache.thrift.transport.TTransportException:
> java.net.SocketTimeoutException: Read timed out"
> We are using MySQL as the metastore database behind the Thrift server.
> The design / architecture is like this:
> Oozie --> Hive Action --> ELB (AWS) --> Hive Thrift (2 servers) --> MySQL
> (Master) --> MySQL (Slave).
> Software versions:
> Hive version : 0.10.0
> Hadoop: 1.2.1
> It looks like the server has trouble responding when the load for certain
> operations goes beyond some threshold.
> As Hive jobs sometimes fail because of this issue, we also have an
> auto-restart check: if the Thrift server is not responding, it stops /
> kills and restarts the service.
> Other tuning done:
> Thrift Server:
> Given an 11 GB heap and configured the CMS GC algorithm.
> MySQL:
> Tuned innodb_buffer, tmp_table and max_heap parameters.
> So, can somebody please help me understand what the root cause could be,
> or has anybody faced a similar issue?
> I found one related JIRA: https://issues.apache.org/jira/browse/HCATALOG-541
> But in that JIRA the Hive Thrift Server shows an OOM error, whereas I
> didn't see any OOM error in my case.
> Regards,
> Manish
> Full Exception Stack:
> at org.apache.thrift.protocol.TBinaryProtocol.readAll(TBinaryProtocol.java:378)
> at org.apache.thrift.protocol.TBinaryProtocol.readI32(TBinaryProtocol.java:297)
> at org.apache.thrift.protocol.TBinaryProtocol.readMessageBegin(TBinaryProtocol.java:204)
> at org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:69)
> at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.recv_get_database(ThriftHiveMetastore.java:412)
> at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.get_database(ThriftHiveMetastore.java:399)
> at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.getDatabase(HiveMetaStoreClient.java:736)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
> at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:601)
> at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.invoke(RetryingMetaStoreClient.java:74)
> at $Proxy7.getDatabase(Unknown Source)
> at org.apache.hadoop.hive.ql.metadata.Hive.getDatabase(Hive.java:1110)
> at org.apache.hadoop.hive.ql.metadata.Hive.databaseExists(Hive.java:1099)
> at org.apache.hadoop.hive.ql.exec.DDLTask.showTables(DDLTask.java:2206)
> at org.apache.hadoop.hive.ql.exec.DDLTask.execute(DDLTask.java:334)
> at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:138)
> at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:57)
> at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1336)
> at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1122)
> at org.apache.hadoop.hive.ql.Driver.run(Driver.java:935)
> at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:259)
> at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:216)
> at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:412)
> at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:347)
> at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:706)
> at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:613)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
> at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:601)
> at org.apache.hadoop.util.RunJar.main(RunJar.java:160)
> Caused by: java.net.SocketTimeoutException: Read timed out
> at java.net.SocketInputStream.socketRead0(Native Method)
> at java.net.SocketInputStream.read(SocketInputStream.java:150)
> at java.net.SocketInputStream.read(SocketInputStream.java:121)
> at java.io.BufferedInputStream.fill(BufferedInputStream.java:235)
> at java.io.BufferedInputStream.read1(BufferedInputStream.java:275)
> at java.io.BufferedInputStream.read(BufferedInputStream.java:334)
> at org.apache.thrift.transport.TIOStreamTransport.read(TIOStreamTransport.java:127)
> ... 34 more
> 2015-01-20 22:44:12,978 ERROR exec.Task (SessionState.java:printError(401)) - FAILED: Error in metadata: org.apache.thrift.transport.TTransportException: java.net.SocketTimeoutException: Read timed out
> org.apache.hadoop.hive.ql.metadata.HiveException: org.apache.thrift.transport.TTransportException: java.net.SocketTimeoutException: Read timed out
> at org.apache.hadoop.hive.ql.metadata.Hive.getDatabase(Hive.java:1114)
> at org.apache.hadoop.hive.ql.metadata.Hive.databaseExists(Hive.java:1099)
> at org.apache.hadoop.hive.ql.exec.DDLTask.showTables(DDLTask.java:2206)
> at org.apache.hadoop.hive.ql.exec.DDLTask.execute(DDLTask.java:334)
> at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:138)
> at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:57)
> at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1336)
> at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1122)
> at org.apache.hadoop.hive.ql.Driver.run(Driver.java:935)
> at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:259)
> at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:216)