Looking at the Hive metastore server logs, I see errors like these:

2013-07-26 06:34:52,853 ERROR server.TThreadPoolServer (TThreadPoolServer.java:run(182)) - Error occurred during processing of message.
java.lang.NullPointerException
    at org.apache.hadoop.hive.metastore.TUGIBasedProcessor.setIpAddress(TUGIBasedProcessor.java:183)
    at org.apache.hadoop.hive.metastore.TUGIBasedProcessor.process(TUGIBasedProcessor.java:79)
    at org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:176)
    at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
    at java.lang.Thread.run(Thread.java:662)

They show up at approximately the same time as we see the timeout or connection reset errors. I don't know whether this is the cause or a side effect of the connection timeout/connection reset errors.

Does anybody have any pointers or suggestions?

Thanks

On Mon, Jul 29, 2013 at 11:29 AM, agateaaa <agate...@gmail.com> wrote:

> Thanks Nitin!
>
> We have a similar setup (identical HCatalog and Hive server versions) on
> another production environment and don't see any errors (it's been running OK
> for a few months).
>
> Unfortunately we won't be able to move to hcat 0.5 and hive 0.11 or hive
> 0.10 soon.
>
> The last time we ran into this problem, running netstat -ntp | grep ":10000"
> showed that the server was holding on to one socket connection in
> CLOSE_WAIT state for a long time (the Hive metastore server is running on
> port 10000). I don't know if that's relevant here or not.
>
> Can you suggest any Hive configuration settings we can tweak, or networking
> tools/tips we can use to narrow this down?
>
> Thanks
> Agateaaa
>
> On Mon, Jul 29, 2013 at 11:02 AM, Nitin Pawar <nitinpawar...@gmail.com> wrote:
>
>> Is there any chance you can do an update on a test environment with hcat-0.5
>> and hive-0(11 or 10) and see if you can reproduce the issue?
>>
>> We used to see this error when there was load on the hcat server or some
>> network issue connecting to the server (the second one was a rare occurrence).
>>
>> On Mon, Jul 29, 2013 at 11:13 PM, agateaaa <agate...@gmail.com> wrote:
>>
>>> Hi All:
>>>
>>> We are running into a frequent problem using HCatalog 0.4.1 (Hive Metastore
>>> Server 0.9) where we get connection reset or connection timeout errors.
>>>
>>> The Hive metastore server has been allocated enough (12G) memory.
>>>
>>> This is a critical problem for us and we would appreciate it if anyone has
>>> any pointers.
>>>
>>> We did add retry logic in our client, which seems to help, but I am just
>>> wondering how we can narrow down to the root cause of this problem.
>>> Could this be a hiccup in networking which causes the Hive
>>> server to get into an unresponsive state?
>>>
>>> Thanks
>>>
>>> Agateaaa
>>>
>>> Example Connection reset error:
>>> =======================
>>>
>>> org.apache.thrift.transport.TTransportException: java.net.SocketException: Connection reset
>>>     at org.apache.thrift.transport.TIOStreamTransport.read(TIOStreamTransport.java:129)
>>>     at org.apache.thrift.transport.TTransport.readAll(TTransport.java:84)
>>>     at org.apache.thrift.protocol.TBinaryProtocol.readAll(TBinaryProtocol.java:378)
>>>     at org.apache.thrift.protocol.TBinaryProtocol.readI32(TBinaryProtocol.java:297)
>>>     at org.apache.thrift.protocol.TBinaryProtocol.readMessageBegin(TBinaryProtocol.java:204)
>>>     at org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:69)
>>>     at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.recv_set_ugi(ThriftHiveMetastore.java:2136)
>>>     at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.set_ugi(ThriftHiveMetastore.java:2122)
>>>     at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.openStore(HiveMetaStoreClient.java:286)
>>>     at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.open(HiveMetaStoreClient.java:197)
>>>     at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.<init>(HiveMetaStoreClient.java:157)
>>>     at org.apache.hadoop.hive.ql.metadata.Hive.createMetaStoreClient(Hive.java:2092)
>>>     at org.apache.hadoop.hive.ql.metadata.Hive.getMSC(Hive.java:2102)
>>>     at org.apache.hadoop.hive.ql.metadata.Hive.getTable(Hive.java:888)
>>>     at org.apache.hadoop.hive.ql.parse.DDLSemanticAnalyzer.analyzeAlterTableAddParts(DDLSemanticAnalyzer.java:1817)
>>>     at org.apache.hadoop.hive.ql.parse.DDLSemanticAnalyzer.analyzeInternal(DDLSemanticAnalyzer.java:297)
>>>     at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:243)
>>>     at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:431)
>>>     at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:336)
>>>     at org.apache.hadoop.hive.ql.Driver.run(Driver.java:909)
>>>     at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:258)
>>>     at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:215)
>>>     at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:406)
>>>     at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:341)
>>>     at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:642)
>>>     at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:557)
>>>     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>>>     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>>>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>>>     at java.lang.reflect.Method.invoke(Method.java:597)
>>>     at org.apache.hadoop.util.RunJar.main(RunJar.java:156)
>>> Caused by: java.net.SocketException: Connection reset
>>>     at java.net.SocketInputStream.read(SocketInputStream.java:168)
>>>     at org.apache.thrift.transport.TIOStreamTransport.read(TIOStreamTransport.java:127)
>>>     ... 30 more
>>>
>>> Example Connection timeout error:
>>> ==========================
>>>
>>> org.apache.thrift.transport.TTransportException: java.net.SocketTimeoutException: Read timed out
>>>     at org.apache.thrift.transport.TIOStreamTransport.read(TIOStreamTransport.java:129)
>>>     at org.apache.thrift.transport.TTransport.readAll(TTransport.java:84)
>>>     at org.apache.thrift.protocol.TBinaryProtocol.readAll(TBinaryProtocol.java:378)
>>>     at org.apache.thrift.protocol.TBinaryProtocol.readI32(TBinaryProtocol.java:297)
>>>     at org.apache.thrift.protocol.TBinaryProtocol.readMessageBegin(TBinaryProtocol.java:204)
>>>     at org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:69)
>>>     at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.recv_set_ugi(ThriftHiveMetastore.java:2136)
>>>     at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.set_ugi(ThriftHiveMetastore.java:2122)
>>>     at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.openStore(HiveMetaStoreClient.java:286)
>>>     at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.open(HiveMetaStoreClient.java:197)
>>>     at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.<init>(HiveMetaStoreClient.java:157)
>>>     at org.apache.hadoop.hive.ql.metadata.Hive.createMetaStoreClient(Hive.java:2092)
>>>     at org.apache.hadoop.hive.ql.metadata.Hive.getMSC(Hive.java:2102)
>>>     at org.apache.hadoop.hive.ql.metadata.Hive.getTable(Hive.java:888)
>>>     at org.apache.hadoop.hive.ql.metadata.Hive.getTable(Hive.java:830)
>>>     at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.getMetaData(SemanticAnalyzer.java:954)
>>>     at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:7524)
>>>     at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:243)
>>>     at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:431)
>>>     at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:336)
>>>     at org.apache.hadoop.hive.ql.Driver.run(Driver.java:909)
>>>     at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:258)
>>>     at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:215)
>>>     at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:406)
>>>     at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:341)
>>>     at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:642)
>>>     at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:557)
>>>     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>>>     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>>>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>>>     at java.lang.reflect.Method.invoke(Method.java:597)
>>>     at org.apache.hadoop.util.RunJar.main(RunJar.java:156)
>>> Caused by: java.net.SocketTimeoutException: Read timed out
>>>     at java.net.SocketInputStream.socketRead0(Native Method)
>>>     at java.net.SocketInputStream.read(SocketInputStream.java:129)
>>>     at org.apache.thrift.transport.TIOStreamTransport.read(TIOStreamTransport.java:127)
>>>     ... 31 more
>>
>> --
>> Nitin Pawar
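
For readers following the thread: the client-side retry logic mentioned in the original message is not shown anywhere above. A minimal sketch of that kind of wrapper might look like the following. The class and method names (RetryingCall, callWithRetry), the attempt count, and the backoff delays are all hypothetical and are not taken from the poster's client or from Hive. It retries on any exception, which is an assumption; a real client would probably retry only on transport-level failures.

```java
import java.util.concurrent.Callable;

// Hypothetical helper, not part of Hive or HCatalog: retries an action
// with exponential backoff when it throws.
public class RetryingCall {

    public static <T> T callWithRetry(Callable<T> action, int maxAttempts, long initialDelayMs)
            throws Exception {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        long delayMs = initialDelayMs;
        Exception last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return action.call();
            } catch (InterruptedException e) {
                // Do not retry if the calling thread was interrupted.
                throw e;
            } catch (Exception e) {
                // Assumption: every other failure is treated as transient.
                last = e;
                if (attempt < maxAttempts) {
                    Thread.sleep(delayMs);
                    delayMs *= 2; // double the wait before the next attempt
                }
            }
        }
        // All attempts failed: surface the most recent error.
        throw last;
    }
}
```

In the situation described above, the action passed in would be whatever call opens the metastore connection (both client traces fail inside HiveMetaStoreClient.open). A wrapper like this only hides the symptom, so the server-side NullPointerException and the CLOSE_WAIT socket still need to be tracked down separately.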