Manish Malhotra created HIVE-9469:
-------------------------------------

             Summary: Hive Thrift Server throws Socket Timeout Exception: Read time out
                 Key: HIVE-9469
                 URL: https://issues.apache.org/jira/browse/HIVE-9469
             Project: Hive
          Issue Type: Bug
          Components: Metastore
    Affects Versions: 0.10.0
         Environment: 4 core cpu, 15gb memory. 2 thrift server behind load balancer
            Reporter: Manish Malhotra


Hi All,

Please review the following problem. I also posted it to the hive-user group, 
but have not received a response yet. 
This happens quite frequently in our environment, 
so it would be great if somebody could take a look and advise. 

I'm using the Hive Thrift Server in production, where it handles around 500 
req/min at peak.
After a certain point the Hive Thrift Server stops responding, and the 
following exception is thrown: 
"org.apache.hadoop.hive.ql.metadata.HiveException: org.apache.thrift.transport.TTransportException: java.net.SocketTimeoutException: Read timed out" 
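
For readers unfamiliar with this exception, the following is a minimal, self-contained sketch of how a "Read timed out" arises at the socket level. It does not use Hive or Thrift: a local listener that never replies stands in for an unresponsive server, and the class and method names (`ReadTimeoutDemo`, `readFromSilentServer`) are made up for illustration. In Hive, the corresponding client-side limit is set through the `hive.metastore.client.socket.timeout` property rather than by calling `setSoTimeout` directly.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.net.SocketTimeoutException;

public class ReadTimeoutDemo {

    /**
     * Connects to a local listener that never sends a reply, then waits up to
     * timeoutMillis for a response. Returns the exception message if the read
     * timed out, or null if the read returned for any other reason.
     */
    static String readFromSilentServer(int timeoutMillis) {
        InetAddress loopback = InetAddress.getLoopbackAddress();
        // Port 0 asks the OS for any free port. The listener never calls
        // accept() or writes anything, which stands in for a server that is
        // too busy to answer. The OS still completes the TCP handshake.
        try (ServerSocket silentServer = new ServerSocket(0, 1, loopback);
             Socket client = new Socket(loopback, silentServer.getLocalPort())) {
            client.setSoTimeout(timeoutMillis); // upper bound for each blocking read
            InputStream in = client.getInputStream();
            in.read();                          // blocks: no bytes ever arrive
            return null;
        } catch (SocketTimeoutException e) {
            return e.getMessage();
        } catch (IOException e) {
            throw new IllegalStateException("unexpected I/O failure", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(readFromSilentServer(200));
    }
}
```

The point of the sketch is that the connection itself succeeds; only the wait for a reply fails. That matches the trace below, where the failure occurs while reading the response to `get_database`, not while connecting.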

We use MySQL as the metastore database, which the Thrift servers connect to. 
The design / architecture is like this: 

Oozie --> Hive Action --> ELB (AWS) --> Hive Thrift (2 servers) --> MySQL 
(Master) --> MySQL (Slave).

Software versions: 

   Hive version : 0.10.0
   Hadoop: 1.2.1


It looks like the server has trouble responding once the load for certain 
operations goes beyond some threshold. 
Because Hive jobs sometimes fail due to this issue, we also run an 
auto-restart check: if the Thrift server is not responding, it stops / 
kills and restarts the service. 
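
As an illustration of the kind of check described above (not the actual script we run), here is a minimal sketch of a TCP probe with a connect timeout. The class name `PortProbe`, the method names, and the 500 ms timeout are hypothetical. Note that a connect-only probe cannot detect a server that still accepts connections but never answers; a more faithful check would issue a cheap metastore call with a read timeout.

```java
import java.io.IOException;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.ServerSocket;
import java.net.Socket;

public class PortProbe {

    /** Returns true if a TCP connection to host:port succeeds within the timeout. */
    static boolean isAccepting(String host, int port, int connectTimeoutMillis) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), connectTimeoutMillis);
            return true;
        } catch (IOException e) {
            // Connection refused, unreachable host, or connect timeout.
            return false;
        }
    }

    /**
     * Probes a throwaway local listener while it is open and again after it
     * has been closed. Returns {resultWhileListening, resultAfterClose}.
     */
    static boolean[] demo() {
        try {
            InetAddress loopback = InetAddress.getLoopbackAddress();
            ServerSocket listener = new ServerSocket(0, 1, loopback);
            int port = listener.getLocalPort();
            String host = loopback.getHostAddress();
            boolean whileListening = isAccepting(host, port, 500);
            listener.close();
            boolean afterClose = isAccepting(host, port, 500);
            return new boolean[] {whileListening, afterClose};
        } catch (IOException e) {
            throw new IllegalStateException("could not open local listener", e);
        }
    }

    public static void main(String[] args) {
        boolean[] result = demo();
        System.out.println("listening: " + result[0] + ", after close: " + result[1]);
    }
}
```

A monitoring loop would call `isAccepting` with the Thrift server's host and port and trigger the restart only after several consecutive failures, to avoid restarting on a single slow response.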

Other tuning done: 

Thrift Server: 

11 GB heap, with the CMS garbage collector configured. 

MySQL: 

Tuned innodb_buffer, tmp_table and max_heap parameters.

So, can somebody please help me understand what the root cause could be, 
or has anybody faced a similar issue? 

I found one related JIRA: https://issues.apache.org/jira/browse/HCATALOG-541

However, that JIRA reports an OOM error in the Hive Thrift Server, and I 
did not see any OOM error in my case.


Regards,
Manish

Full Exception Stack: 

    at org.apache.thrift.protocol.TBinaryProtocol.readAll(TBinaryProtocol.java:378)
    at org.apache.thrift.protocol.TBinaryProtocol.readI32(TBinaryProtocol.java:297)
    at org.apache.thrift.protocol.TBinaryProtocol.readMessageBegin(TBinaryProtocol.java:204)
    at org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:69)
    at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.recv_get_database(ThriftHiveMetastore.java:412)
    at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.get_database(ThriftHiveMetastore.java:399)
    at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.getDatabase(HiveMetaStoreClient.java:736)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:601)
    at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.invoke(RetryingMetaStoreClient.java:74)
    at $Proxy7.getDatabase(Unknown Source)
    at org.apache.hadoop.hive.ql.metadata.Hive.getDatabase(Hive.java:1110)
    at org.apache.hadoop.hive.ql.metadata.Hive.databaseExists(Hive.java:1099)
    at org.apache.hadoop.hive.ql.exec.DDLTask.showTables(DDLTask.java:2206)
    at org.apache.hadoop.hive.ql.exec.DDLTask.execute(DDLTask.java:334)
    at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:138)
    at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:57)
    at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1336)
    at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1122)
    at org.apache.hadoop.hive.ql.Driver.run(Driver.java:935)
    at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:259)
    at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:216)
    at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:412)
    at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:347)
    at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:706)
    at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:613)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:601)
    at org.apache.hadoop.util.RunJar.main(RunJar.java:160)
Caused by: java.net.SocketTimeoutException: Read timed out
    at java.net.SocketInputStream.socketRead0(Native Method)
    at java.net.SocketInputStream.read(SocketInputStream.java:150)
    at java.net.SocketInputStream.read(SocketInputStream.java:121)
    at java.io.BufferedInputStream.fill(BufferedInputStream.java:235)
    at java.io.BufferedInputStream.read1(BufferedInputStream.java:275)
    at java.io.BufferedInputStream.read(BufferedInputStream.java:334)
    at org.apache.thrift.transport.TIOStreamTransport.read(TIOStreamTransport.java:127)
    ... 34 more
2015-01-20 22:44:12,978 ERROR exec.Task (SessionState.java:printError(401)) - FAILED: Error in metadata: org.apache.thrift.transport.TTransportException: java.net.SocketTimeoutException: Read timed out
org.apache.hadoop.hive.ql.metadata.HiveException: org.apache.thrift.transport.TTransportException: java.net.SocketTimeoutException: Read timed out
    at org.apache.hadoop.hive.ql.metadata.Hive.getDatabase(Hive.java:1114)
    at org.apache.hadoop.hive.ql.metadata.Hive.databaseExists(Hive.java:1099)
    at org.apache.hadoop.hive.ql.exec.DDLTask.showTables(DDLTask.java:2206)
    at org.apache.hadoop.hive.ql.exec.DDLTask.execute(DDLTask.java:334)
    at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:138)
    at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:57)
    at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1336)
    at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1122)
    at org.apache.hadoop.hive.ql.Driver.run(Driver.java:935)
    at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:259)
    at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:216)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
