Error communicating with the metastore | org.apache.thrift.transport.TTransportException: java.net.SocketException: Connection reset
Hi, everyone.

We're getting this error on Hive 3.1.0:

org.apache.hadoop.hive.ql.lockmgr.LockException: Error communicating with the metastore
    at org.apache.hadoop.hive.ql.lockmgr.DbTxnManager.getValidWriteIds(DbTxnManager.java:714)
    at org.apache.hadoop.hive.ql.Driver.recordValidWriteIds(Driver.java:1463)
    at org.apache.hadoop.hive.ql.Driver.acquireLocks(Driver.java:1653)
    at org.apache.hadoop.hive.ql.Driver.lockAndRespond(Driver.java:1832)
    at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:2002)
    at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1746)
    at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1740)
    at org.apache.hadoop.hive.ql.reexec.ReExecDriver.run(ReExecDriver.java:157)
    at org.apache.hive.service.cli.operation.SQLOperation.runQuery(SQLOperation.java:226)
    at org.apache.hive.service.cli.operation.SQLOperation.access$700(SQLOperation.java:87)
    at org.apache.hive.service.cli.operation.SQLOperation$BackgroundWork$1.run(SQLOperation.java:318)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:422)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1730)
    at org.apache.hive.service.cli.operation.SQLOperation$BackgroundWork.run(SQLOperation.java:331)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)
Caused by: org.apache.thrift.transport.TTransportException: java.net.SocketException: Connection reset
    at org.apache.thrift.transport.TIOStreamTransport.write(TIOStreamTransport.java:147)
    at org.apache.thrift.protocol.TBinaryProtocol.writeString(TBinaryProtocol.java:202)
    at org.apache.hadoop.hive.metastore.api.GetValidWriteIdsRequest$GetValidWriteIdsRequestStandardScheme.write(GetValidWriteIdsRequest.java:489)
    at org.apache.hadoop.hive.metastore.api.GetValidWriteIdsRequest$GetValidWriteIdsRequestStandardScheme.write(GetValidWriteIdsRequest.java:424)
    at org.apache.hadoop.hive.metastore.api.GetValidWriteIdsRequest.write(GetValidWriteIdsRequest.java:362)
    at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$get_valid_write_ids_args$get_valid_write_ids_argsStandardScheme.write(ThriftHiveMetastore.java)
    at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$get_valid_write_ids_args$get_valid_write_ids_argsStandardScheme.write(ThriftHiveMetastore.java)
    at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$get_valid_write_ids_args.write(ThriftHiveMetastore.java)
    at org.apache.thrift.TServiceClient.sendBase(TServiceClient.java:71)
    at org.apache.thrift.TServiceClient.sendBase(TServiceClient.java:62)
    at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.send_get_valid_write_ids(ThriftHiveMetastore.java:5443)
    at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.get_valid_write_ids(ThriftHiveMetastore.java:5435)
    at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.getValidWriteIds(HiveMetaStoreClient.java:2589)
    at sun.reflect.GeneratedMethodAccessor37.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.invoke(RetryingMetaStoreClient.java:212)
    at com.sun.proxy.$Proxy37.getValidWriteIds(Unknown Source)
    at sun.reflect.GeneratedMethodAccessor37.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.apache.hadoop.hive.metastore.HiveMetaStoreClient$SynchronizedHandler.invoke(HiveMetaStoreClient.java:2934)
    at com.sun.proxy.$Proxy37.getValidWriteIds(Unknown Source)
    at org.apache.hadoop.hive.ql.lockmgr.DbTxnManager.getValidWriteIds(DbTxnManager.java:712)
    ... 21 more
Caused by: java.net.SocketException: Connection reset
    at java.net.SocketOutputStream.socketWrite(SocketOutputStream.java:113)
    at java.net.SocketOutputStream.write(SocketOutputStream.java:153)
    at java.io.BufferedOutputStream.write(BufferedOutputStream.java:122)
    at org.apache.thrift.transport.TIOStreamTransport.write(TIOStreamTransport.java:145)
    ... 44 more

It looks like the server resets the connection while the Thrift client is still writing the request to the OutputStream. Is there a workaround for this? Or has anyone else run into it? This only started showing up recently.

Thank you in advance.

Regards,
Bernard
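In case it's relevant: one thing we're considering (not yet confirmed to help) is raising the metastore client socket timeout and retry settings in hive-site.xml, so that transient resets get retried instead of failing the query. The values below are just placeholders, not recommendations:

```xml
<!-- Sketch only: candidate hive-site.xml settings for transient metastore
     connection resets. Values are illustrative, not tuned. -->
<property>
  <!-- How long the Thrift client waits on the metastore socket -->
  <name>hive.metastore.client.socket.timeout</name>
  <value>600s</value>
</property>
<property>
  <!-- Number of times HiveMetaStoreClient retries a failed call -->
  <name>hive.metastore.failure.retries</name>
  <value>3</value>
</property>
<property>
  <!-- Delay between reconnect attempts to the metastore -->
  <name>hive.metastore.client.connect.retry.delay</name>
  <value>5s</value>
</property>
```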
Re: com.google.protobuf.InvalidProtocolBufferException: Protocol message was too large.
Hi.

We fixed the issue by patching protobuf-java-2.5.0.jar: we changed CodedInputStream.DEFAULT_SIZE_LIMIT to 1 GB. We uploaded the patched jar to our servers and prepended its location to *tez.cluster.additional.classpath.prefix* (in tez-site.xml) as /path/to/patched/protobuf-java-2.5.0.jar:. Please note that it must be the first jar on *tez.cluster.additional.classpath.prefix*. Apparently, Tez was using protobuf's default 64 MB message size limit. BTW, in the latest protobuf versions the default limit is set to Integer.MAX_VALUE. See https://github.com/protocolbuffers/protobuf/blob/v3.11.3/java/core/src/main/java/com/google/protobuf/CodedInputStream.java#L62-L65 .

Regards,
Bernard

On Mon, Feb 10, 2020 at 8:23 PM Bernard Quizon <bernard.qui...@cheetahdigital.com> wrote:
> Hi.
>
> We're using Hive 3.0.1 and we're currently experiencing this issue:
>
> Error while processing statement: FAILED: Execution Error, return code 2
> from org.apache.hadoop.hive.ql.exec.tez.TezTask.
> Vertex failed, vertexName=Map 1, vertexId=vertex_1581309524541_0094_14_00,
> diagnostics=[Vertex vertex_1581309524541_0094_14_00 [Map 1] killed/failed
> due to:INIT_FAILURE, Fail to create InputInitializerManager,
> org.apache.tez.dag.api.TezReflectionException: Unable to instantiate class
> with 1 arguments: org.apache.hadoop.hive.ql.exec.tez.HiveSplitGenerator
>     at org.apache.tez.common.ReflectionUtils.getNewInstance(ReflectionUtils.java:71)
>     at org.apache.tez.common.ReflectionUtils.createClazzInstance(ReflectionUtils.java:89)
>     at org.apache.tez.dag.app.dag.RootInputInitializerManager$1.run(RootInputInitializerManager.java:152)
>     at org.apache.tez.dag.app.dag.RootInputInitializerManager$1.run(RootInputInitializerManager.java:148)
>     at java.security.AccessController.doPrivileged(Native Method)
>     at javax.security.auth.Subject.doAs(Subject.java:422)
>     at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1730)
>     at org.apache.tez.dag.app.dag.RootInputInitializerManager.createInitializer(RootInputInitializerManager.java:148)
>     at org.apache.tez.dag.app.dag.RootInputInitializerManager.runInputInitializers(RootInputInitializerManager.java:121)
>     at org.apache.tez.dag.app.dag.impl.VertexImpl.setupInputInitializerManager(VertexImpl.java:4122)
>     at org.apache.tez.dag.app.dag.impl.VertexImpl.access$3100(VertexImpl.java:207)
>     at org.apache.tez.dag.app.dag.impl.VertexImpl$InitTransition.handleInitEvent(VertexImpl.java:2932)
>     at org.apache.tez.dag.app.dag.impl.VertexImpl$InitTransition.transition(VertexImpl.java:2879)
>     at org.apache.tez.dag.app.dag.impl.VertexImpl$InitTransition.transition(VertexImpl.java:2861)
>     at org.apache.hadoop.yarn.state.StateMachineFactory$MultipleInternalArc.doTransition(StateMachineFactory.java:385)
>     at org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:302)
>     at org.apache.hadoop.yarn.state.StateMachineFactory.access$500(StateMachineFactory.java:46)
>     at org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine.doTransition(StateMachineFactory.java:487)
>     at org.apache.tez.state.StateMachineTez.doTransition(StateMachineTez.java:59)
>     at org.apache.tez.dag.app.dag.impl.VertexImpl.handle(VertexImpl.java:1957)
>     at org.apache.tez.dag.app.dag.impl.VertexImpl.handle(VertexImpl.java:206)
>     at org.apache.tez.dag.app.DAGAppMaster$VertexEventDispatcher.handle(DAGAppMaster.java:2317)
>     at org.apache.tez.dag.app.DAGAppMaster$VertexEventDispatcher.handle(DAGAppMaster.java:2303)
>     at org.apache.tez.common.AsyncDispatcher.dispatch(AsyncDispatcher.java:180)
>     at org.apache.tez.common.AsyncDispatcher$1.run(AsyncDispatcher.java:115)
>     at java.lang.Thread.run(Thread.java:745)
> Caused by: java.lang.reflect.InvocationTargetException
>     at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
>     at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
>     at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
>     at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
>     at org.apache.tez.common.ReflectionUtils.getNewInstance(ReflectionUtils.java:68)
>     ... 25 more
> Caused by: com.google.protobuf.InvalidProtocolBufferException:
> Protocol message was too large. May be malicious. Use
> CodedInputStream.setSizeLimit() to increase the size limit.
>     at com.google.protobuf.InvalidProtocolBufferException.sizeLimitExceeded(InvalidProtocolBufferException.java:110)
>     at com.google.protobuf.CodedInputStream.refillBuffer(CodedInputStream.java:755)
>     at com.google.pr
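For anyone hitting the same thing, the classpath change described in the fix above looks roughly like this in tez-site.xml. The jar path is a placeholder for wherever you deploy the patched jar on the cluster nodes:

```xml
<!-- Sketch of the tez-site.xml change described above.
     The trailing colon matters: it keeps the rest of the classpath intact,
     and the patched jar must come first so it wins over the bundled 2.5.0. -->
<property>
  <name>tez.cluster.additional.classpath.prefix</name>
  <value>/path/to/patched/protobuf-java-2.5.0.jar:</value>
</property>
```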