[ https://issues.apache.org/jira/browse/HDDS-4351?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17215219#comment-17215219 ]
Glen Geng commented on HDDS-4351:
---------------------------------

Hello [~erose] [~arp] [~bharat]

As requested by Ethan, I scheduled a long-run test on the latest master with HDDS-4327. The good news is that there was no DN crash during the whole test. In fact, the try-with-resources on BatchOperation fixed the crash in RocksDB. I suggest closing this Jira and re-opening it if we see a similar crash again in the future.

> DN crash while RatisApplyTransactionExecutor tries to putBlock to rocksDB
> -------------------------------------------------------------------------
>
>                 Key: HDDS-4351
>                 URL: https://issues.apache.org/jira/browse/HDDS-4351
>             Project: Hadoop Distributed Data Store
>          Issue Type: Bug
>          Components: Ozone Datanode
>    Affects Versions: 1.1.0
>            Reporter: Glen Geng
>            Assignee: Ethan Rose
>            Priority: Major
>
> At Tencent, we pick up the latest master monthly and deploy it to our production environment.
> This time we tested c956ce6 ([HDDS-4262. Use ClientID and CallID from Rpc Client to detect retry re…|https://github.com/apache/hadoop-ozone/commit/c956ce6b7537a0286c01b15d4963333a7ffeba90]) and frequently encountered datanode crashes during putBlock.
>
> *The setup* is 3 DNs, each engaged in 8 pipelines, plus 1 OM, 1 SCM and 1 Gateway.
> *The repro procedure* is simple: continually write 10GB files to s3g from Python (the AWS lib boto3); after writing tens of files, a DN may crash while applying putBlock operations.
> After running the test for 10 hours on a version with HDDS-3869 reverted, no DN crash occurred.
> Will schedule a long-run test on the latest master with HDDS-4327, to check whether adding try-with-resources to BatchOperation fixes the crash.
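For context on the fix being credited above: per this thread, HDDS-4327 wrapped the datanode's BatchOperation in try-with-resources, so the wrapper around the native RocksDB WriteBatch is deterministically closed on the code path that used it rather than whenever the GC gets to it. The sketch below is NOT the Ozone code — `FakeBatchOperation` and `putBlockFixed` are hypothetical stand-ins — it only illustrates the guarantee that try-with-resources gives: close() runs even if commit() throws.

```java
import java.util.ArrayList;
import java.util.List;

/**
 * Minimal sketch (hypothetical names, not the Ozone classes) of why
 * try-with-resources on a batch wrapper matters: a native-backed
 * WriteBatch that is never closed, or is closed on an unrelated
 * finalizer thread, can leave native memory in a bad state for later
 * writes. try-with-resources pins close() to the using code path.
 */
public class BatchCloseSketch {

    /** Stand-in for a wrapper around a native RocksDB WriteBatch. */
    static class FakeBatchOperation implements AutoCloseable {
        boolean closed = false;
        final List<String> ops = new ArrayList<>();

        void put(String key, String value) {
            if (closed) throw new IllegalStateException("use after close");
            ops.add("put " + key + "=" + value);
        }

        void commit() {
            if (closed) throw new IllegalStateException("commit after close");
            // a real implementation would hand the batch to the DB here
        }

        @Override
        public void close() { closed = true; }
    }

    /** The fixed pattern: close() runs even if commit() throws. */
    static FakeBatchOperation putBlockFixed() {
        FakeBatchOperation batch = new FakeBatchOperation();
        try (FakeBatchOperation b = batch) {
            b.put("blockId", "blockData");
            b.commit();
        }
        return batch;  // batch.closed is guaranteed to be true here
    }

    public static void main(String[] args) {
        FakeBatchOperation batch = putBlockFixed();
        System.out.println("closed after try-with-resources: " + batch.closed);
    }
}
```

The real `RDBBatchOperation` wraps `org.rocksdb.WriteBatch`, whose close() releases native memory; the same try-with-resources shape applies there.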
>
> *Example1: segment fault while putBlock.*
> {code:java}
> Current thread (0x00007eff34524000): JavaThread "RatisApplyTransactionExecutor 9" daemon [_thread_in_native, id=20401, stack(0x00007efef4a14000,0x00007efef4b15000)]
>
> siginfo: si_signo: 11 (SIGSEGV), si_code: 2 (SEGV_ACCERR), si_addr: 0x00007eff37eb9000
>
> Registers:
> RAX=0x00007efe8bbfb024, RBX=0x0000000000000000, RCX=0x0000000000000000, RDX=0x00000000007688e4
> RSP=0x00007efef4b11e38, RBP=0x00007efef4b11f60, RSI=0x00007eff37eb8feb, RDI=0x00007efe8f892640
> R8 =0x00007efe8bbfb024, R9 =0x0000000000800000, R10=0x0000000000000022, R11=0x0000000000001000
> R12=0x00007efef4b12100, R13=0x00007eff340badc0, R14=0x00007eff340bb7b0, R15=0x0000000004400000
> RIP=0x00007eff4fa04bae, EFLAGS=0x0000000000010206, CSGSFS=0x0000000000000033, ERR=0x0000000000000004
> TRAPNO=0x000000000000000e
>
> Stack: [0x00007efef4a14000,0x00007efef4b15000], sp=0x00007efef4b11e38, free space=1015k
> Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code)
> C  [libc.so.6+0x151bae]  __memmove_ssse3_back+0x192e
> C  [librocksdbjni3701435679326554484.so+0x3b2263]  rocksdb::MemTableInserter::DeleteCF(unsigned int, rocksdb::Slice const&)+0x253
> C  [librocksdbjni3701435679326554484.so+0x3a889f]  rocksdb::WriteBatchInternal::Iterate(rocksdb::WriteBatch const*, rocksdb::WriteBatch::Handler*, unsigned long, unsigned long)+0x75f
> C  [librocksdbjni3701435679326554484.so+0x3a8d44]  rocksdb::WriteBatch::Iterate(rocksdb::WriteBatch::Handler*) const+0x24
> C  [librocksdbjni3701435679326554484.so+0x3ac3f9]  rocksdb::WriteBatchInternal::InsertInto(rocksdb::WriteThread::WriteGroup&, unsigned long, rocksdb::ColumnFamilyMemTables*, rocksdb::FlushScheduler*, rocksdb::TrimHistoryScheduler*, bool, unsigned long, rocksdb::DB*, bool, bool, bool)+0x249
> C  [librocksdbjni3701435679326554484.so+0x2f6308]  rocksdb::DBImpl::WriteImpl(rocksdb::WriteOptions const&, rocksdb::WriteBatch*, rocksdb::WriteCallback*, unsigned long*, unsigned long, bool, unsigned long*, unsigned long, rocksdb::PreReleaseCallback*)+0x1e98
> C  [librocksdbjni3701435679326554484.so+0x2f70c1]  rocksdb::DBImpl::Write(rocksdb::WriteOptions const&, rocksdb::WriteBatch*)+0x21
> C  [librocksdbjni3701435679326554484.so+0x1dd0cc]  Java_org_rocksdb_RocksDB_write0+0xcc
> j  org.rocksdb.RocksDB.write0(JJJ)V+0
> J 8597 C1 org.apache.hadoop.ozone.container.keyvalue.impl.BlockManagerImpl.putBlock(Lorg/apache/hadoop/ozone/container/common/interfaces/Container;Lorg/apache/hadoop/ozone/container/common/helpers/BlockData;Z)J (487 bytes) @ 0x00007eff3a8dd84c [0x00007eff3a8db8e0+0x1f6c]
> J 8700 C1 org.apache.hadoop.ozone.container.keyvalue.KeyValueHandler.handlePutBlock(Lorg/apache/hadoop/hdds/protocol/datanode/proto/ContainerProtos$ContainerCommandRequestProto;Lorg/apache/hadoop/ozone/container/keyvalue/KeyValueContainer;Lorg/apache/hadoop/ozone/container/common/transport/server/ratis/DispatcherContext;)Lorg/apache/hadoop/hdds/protocol/datanode/proto/ContainerProtos$ContainerCommandResponseProto; (211 bytes) @ 0x00007eff3a927ebc [0x00007eff3a926220+0x1c9c]
> J 6685 C1 org.apache.hadoop.ozone.container.keyvalue.KeyValueHandler.dispatchRequest(Lorg/apache/hadoop/ozone/container/keyvalue/KeyValueHandler;Lorg/apache/hadoop/hdds/protocol/datanode/proto/ContainerProtos$ContainerCommandRequestProto;Lorg/apache/hadoop/ozone/container/keyvalue/KeyValueContainer;Lorg/apache/hadoop/ozone/container/common/transport/server/ratis/DispatcherContext;)Lorg/apache/hadoop/hdds/protocol/datanode/proto/ContainerProtos$ContainerCommandResponseProto; (228 bytes) @ 0x00007eff3a2ba2c4 [0x00007eff3a2b7640+0x2c84]
> J 6684 C1 org.apache.hadoop.ozone.container.keyvalue.KeyValueHandler.handle(Lorg/apache/hadoop/hdds/protocol/datanode/proto/ContainerProtos$ContainerCommandRequestProto;Lorg/apache/hadoop/ozone/container/common/interfaces/Container;Lorg/apache/hadoop/ozone/container/common/transport/server/ratis/DispatcherContext;)Lorg/apache/hadoop/hdds/protocol/datanode/proto/ContainerProtos$ContainerCommandResponseProto; (11 bytes) @ 0x00007eff3a29a8ac [0x00007eff3a29a740+0x16c]
> J 7108 C1 org.apache.hadoop.ozone.container.common.impl.HddsDispatcher.dispatchRequest(Lorg/apache/hadoop/hdds/protocol/datanode/proto/ContainerProtos$ContainerCommandRequestProto;Lorg/apache/hadoop/ozone/container/common/transport/server/ratis/DispatcherContext;)Lorg/apache/hadoop/hdds/protocol/datanode/proto/ContainerProtos$ContainerCommandResponseProto; (1105 bytes) @ 0x00007eff3a4a8324 [0x00007eff3a4a2b40+0x57e4]
> J 7105 C1 org.apache.hadoop.hdds.server.OzoneProtocolMessageDispatcher.processRequest(Ljava/lang/Object;Lorg/apache/hadoop/hdds/function/FunctionWithServiceException;Ljava/lang/Object;Ljava/lang/String;)Ljava/lang/Object; (205 bytes) @ 0x00007eff3a48158c [0x00007eff3a480220+0x136c]
> J 7102 C1 org.apache.hadoop.ozone.container.common.impl.HddsDispatcher.dispatch(Lorg/apache/hadoop/hdds/protocol/datanode/proto/ContainerProtos$ContainerCommandRequestProto;Lorg/apache/hadoop/ozone/container/common/transport/server/ratis/DispatcherContext;)Lorg/apache/hadoop/hdds/protocol/datanode/proto/ContainerProtos$ContainerCommandResponseProto; (38 bytes) @ 0x00007eff3a472bfc [0x00007eff3a4725a0+0x65c]
> J 7250 C1 org.apache.hadoop.ozone.container.common.transport.server.ratis.ContainerStateMachine.dispatchCommand(Lorg/apache/hadoop/hdds/protocol/datanode/proto/ContainerProtos$ContainerCommandRequestProto;Lorg/apache/hadoop/ozone/container/common/transport/server/ratis/DispatcherContext;)Lorg/apache/hadoop/hdds/protocol/datanode/proto/ContainerProtos$ContainerCommandResponseProto; (103 bytes) @ 0x00007eff39c397a4 [0x00007eff39c38260+0x1544]
> J 7960 C1 org.apache.hadoop.ozone.container.common.transport.server.ratis.ContainerStateMachine.lambda$applyTransaction$6(Lorg/apache/hadoop/hdds/protocol/datanode/proto/ContainerProtos$ContainerCommandRequestProto;Lorg/apache/hadoop/ozone/container/common/transport/server/ratis/DispatcherContext$Builder;JLjava/util/concurrent/CompletableFuture;)Lorg/apache/hadoop/hdds/protocol/datanode/proto/ContainerProtos$ContainerCommandResponseProto; (81 bytes) @ 0x00007eff3a7b654c [0x00007eff3a7b6280+0x2cc]
> J 7959 C1 org.apache.hadoop.ozone.container.common.transport.server.ratis.ContainerStateMachine$$Lambda$572.get()Ljava/lang/Object; (24 bytes) @ 0x00007eff3a7b2124 [0x00007eff3a7b2080+0xa4]
> J 7226 C1 java.util.concurrent.CompletableFuture$AsyncSupply.run()V (61 bytes) @ 0x00007eff3a0e672c [0x00007eff3a0e6520+0x20c]
> j  java.util.concurrent.ThreadPoolExecutor.runWorker(Ljava/util/concurrent/ThreadPoolExecutor$Worker;)V+95
> j  java.util.concurrent.ThreadPoolExecutor$Worker.run()V+5
> j  java.lang.Thread.run()V+11
> v  ~StubRoutines::call_stub
> V  [libjvm.so+0x68868b]  JavaCalls::call_helper(JavaValue*, methodHandle*, JavaCallArguments*, Thread*)+0xddb
> V  [libjvm.so+0x685f53]  JavaCalls::call_virtual(JavaValue*, KlassHandle, Symbol*, Symbol*, JavaCallArguments*, Thread*)+0x263
> V  [libjvm.so+0x686517]  JavaCalls::call_virtual(JavaValue*, Handle, KlassHandle, Symbol*, Symbol*, Thread*)+0x47
> V  [libjvm.so+0x6f268c]  thread_entry(JavaThread*, Thread*)+0x6c
> V  [libjvm.so+0xa7ca9b]  JavaThread::thread_main_inner()+0xdb
> V  [libjvm.so+0xa7cda1]  JavaThread::run()+0x2d1
> V  [libjvm.so+0x90dcb2]  java_start(Thread*)+0x102
> C  [libpthread.so.0+0x7e25]  start_thread+0xc5
>
> Java frames: (J=compiled Java code, j=interpreted, Vv=VM code)
> j  org.rocksdb.RocksDB.write0(JJJ)V+0
> J 8597 C1 org.apache.hadoop.ozone.container.keyvalue.impl.BlockManagerImpl.putBlock(Lorg/apache/hadoop/ozone/container/common/interfaces/Container;Lorg/apache/hadoop/ozone/container/common/helpers/BlockData;Z)J (487 bytes) @ 0x00007eff3a8dd84c [0x00007eff3a8db8e0+0x1f6c]
> J 8700 C1 org.apache.hadoop.ozone.container.keyvalue.KeyValueHandler.handlePutBlock(Lorg/apache/hadoop/hdds/protocol/datanode/proto/ContainerProtos$ContainerCommandRequestProto;Lorg/apache/hadoop/ozone/container/keyvalue/KeyValueContainer;Lorg/apache/hadoop/ozone/container/common/transport/server/ratis/DispatcherContext;)Lorg/apache/hadoop/hdds/protocol/datanode/proto/ContainerProtos$ContainerCommandResponseProto; (211 bytes) @ 0x00007eff3a927ebc [0x00007eff3a926220+0x1c9c]
> J 6685 C1 org.apache.hadoop.ozone.container.keyvalue.KeyValueHandler.dispatchRequest(Lorg/apache/hadoop/ozone/container/keyvalue/KeyValueHandler;Lorg/apache/hadoop/hdds/protocol/datanode/proto/ContainerProtos$ContainerCommandRequestProto;Lorg/apache/hadoop/ozone/container/keyvalue/KeyValueContainer;Lorg/apache/hadoop/ozone/container/common/transport/server/ratis/DispatcherContext;)Lorg/apache/hadoop/hdds/protocol/datanode/proto/ContainerProtos$ContainerCommandResponseProto; (228 bytes) @ 0x00007eff3a2ba2c4 [0x00007eff3a2b7640+0x2c84]
> J 6684 C1 org.apache.hadoop.ozone.container.keyvalue.KeyValueHandler.handle(Lorg/apache/hadoop/hdds/protocol/datanode/proto/ContainerProtos$ContainerCommandRequestProto;Lorg/apache/hadoop/ozone/container/common/interfaces/Container;Lorg/apache/hadoop/ozone/container/common/transport/server/ratis/DispatcherContext;)Lorg/apache/hadoop/hdds/protocol/datanode/proto/ContainerProtos$ContainerCommandResponseProto; (11 bytes) @ 0x00007eff3a29a8ac [0x00007eff3a29a740+0x16c]
> J 7108 C1 org.apache.hadoop.ozone.container.common.impl.HddsDispatcher.dispatchRequest(Lorg/apache/hadoop/hdds/protocol/datanode/proto/ContainerProtos$ContainerCommandRequestProto;Lorg/apache/hadoop/ozone/container/common/transport/server/ratis/DispatcherContext;)Lorg/apache/hadoop/hdds/protocol/datanode/proto/ContainerProtos$ContainerCommandResponseProto; (1105 bytes) @ 0x00007eff3a4a8324 [0x00007eff3a4a2b40+0x57e4]
> J 7105 C1 org.apache.hadoop.hdds.server.OzoneProtocolMessageDispatcher.processRequest(Ljava/lang/Object;Lorg/apache/hadoop/hdds/function/FunctionWithServiceException;Ljava/lang/Object;Ljava/lang/String;)Ljava/lang/Object; (205 bytes) @ 0x00007eff3a48158c [0x00007eff3a480220+0x136c]
> J 7102 C1 org.apache.hadoop.ozone.container.common.impl.HddsDispatcher.dispatch(Lorg/apache/hadoop/hdds/protocol/datanode/proto/ContainerProtos$ContainerCommandRequestProto;Lorg/apache/hadoop/ozone/container/common/transport/server/ratis/DispatcherContext;)Lorg/apache/hadoop/hdds/protocol/datanode/proto/ContainerProtos$ContainerCommandResponseProto; (38 bytes) @ 0x00007eff3a472bfc [0x00007eff3a4725a0+0x65c]
> J 7250 C1 org.apache.hadoop.ozone.container.common.transport.server.ratis.ContainerStateMachine.dispatchCommand(Lorg/apache/hadoop/hdds/protocol/datanode/proto/ContainerProtos$ContainerCommandRequestProto;Lorg/apache/hadoop/ozone/container/common/transport/server/ratis/DispatcherContext;)Lorg/apache/hadoop/hdds/protocol/datanode/proto/ContainerProtos$ContainerCommandResponseProto; (103 bytes) @ 0x00007eff39c397a4 [0x00007eff39c38260+0x1544]
> J 7960 C1 org.apache.hadoop.ozone.container.common.transport.server.ratis.ContainerStateMachine.lambda$applyTransaction$6(Lorg/apache/hadoop/hdds/protocol/datanode/proto/ContainerProtos$ContainerCommandRequestProto;Lorg/apache/hadoop/ozone/container/common/transport/server/ratis/DispatcherContext$Builder;JLjava/util/concurrent/CompletableFuture;)Lorg/apache/hadoop/hdds/protocol/datanode/proto/ContainerProtos$ContainerCommandResponseProto; (81 bytes) @ 0x00007eff3a7b654c [0x00007eff3a7b6280+0x2cc]
> J 7959 C1 org.apache.hadoop.ozone.container.common.transport.server.ratis.ContainerStateMachine$$Lambda$572.get()Ljava/lang/Object; (24 bytes) @ 0x00007eff3a7b2124 [0x00007eff3a7b2080+0xa4]
> J 7226 C1 java.util.concurrent.CompletableFuture$AsyncSupply.run()V (61 bytes) @ 0x00007eff3a0e672c [0x00007eff3a0e6520+0x20c]
> j  java.util.concurrent.ThreadPoolExecutor.runWorker(Ljava/util/concurrent/ThreadPoolExecutor$Worker;)V+95
> j  java.util.concurrent.ThreadPoolExecutor$Worker.run()V+5
> j  java.lang.Thread.run()V+11
> v  ~StubRoutines::call_stub
> {code}
> *Example2: new throw std::bad_alloc while putBlock*
> {code:java}
> [Thread debugging using libthread_db enabled]
> Using host libthread_db library "/lib64/libthread_db.so.1".
> Missing separate debuginfo for /opt/software/jdk1.8.0_231/jre/lib/amd64/server/libjvm.so
> Missing separate debuginfo for /opt/software/jdk1.8.0_231/jre/lib/amd64/libverify.so
> Missing separate debuginfo for /opt/software/jdk1.8.0_231/jre/lib/amd64/libmanagement.so
> Core was generated by `/opt/software/jdk1.8.0_231/bin/java -Dproc_datanode -Djava.net.preferIPv4Stack='.
> Program terminated with signal 6, Aborted.
> #0  0x00007f293e2b41f7 in raise () from /lib64/libc.so.6
> Missing separate debuginfos, use: debuginfo-install glibc-2.17-196.tl2.3.x86_64 java-1.8.0-openjdk-headless-1.8.0.71-2.b15.el7_2.x86_64 libgcc-4.8.5-5.tl2.x86_64 libstdc++-4.8.5-5.tl2.x86_64 lz4-1.7.5-2.tl2.x86_64 snappy-1.1.0-3.el7.x86_64 zlib-1.2.7-15.el7.x86_64
> (gdb) bt
> #0  0x00007f293e2b41f7 in raise () from /lib64/libc.so.6
> #1  0x00007f293e2b58e8 in abort () from /lib64/libc.so.6
> #2  0x00007f28e6eda9d5 in __gnu_cxx::__verbose_terminate_handler() () from /lib64/libstdc++.so.6
> #3  0x00007f28e6ed8946 in ?? () from /lib64/libstdc++.so.6
> #4  0x00007f28e6ed8973 in std::terminate() () from /lib64/libstdc++.so.6
> #5  0x00007f28e6ed8b93 in __cxa_throw () from /lib64/libstdc++.so.6
> #6  0x00007f28e6ed912d in operator new(unsigned long) () from /lib64/libstdc++.so.6
> #7  0x00007f28e6f37c69 in std::string::_Rep::_S_create(unsigned long, unsigned long, std::allocator<char> const&) () from /lib64/libstdc++.so.6
> #8  0x00007f28e6f37e56 in std::string::_M_mutate(unsigned long, unsigned long, unsigned long) () from /lib64/libstdc++.so.6
> #9  0x00007f28e6f37fe6 in std::string::_M_leak_hard() () from /lib64/libstdc++.so.6
> #10 0x00007f28e6487211 in rocksdb::WriteBatchInternal::SetSequence(rocksdb::WriteBatch*, unsigned long) () from /tmp/librocksdbjni4549469381011773502.so
> #11 0x00007f28e648a3e0 in rocksdb::WriteBatchInternal::InsertInto(rocksdb::WriteThread::WriteGroup&, unsigned long, rocksdb::ColumnFamilyMemTables*, rocksdb::FlushScheduler*, rocksdb::TrimHistoryScheduler*, bool, unsigned long, rocksdb::DB*, bool, bool, bool) () from /tmp/librocksdbjni4549469381011773502.so
> #12 0x00007f28e63d4308 in rocksdb::DBImpl::WriteImpl(rocksdb::WriteOptions const&, rocksdb::WriteBatch*, rocksdb::WriteCallback*, unsigned long*, unsigned long, bool, unsigned long*, unsigned long, rocksdb::PreReleaseCallback*) () from /tmp/librocksdbjni4549469381011773502.so
> #13 0x00007f28e63d50c1 in rocksdb::DBImpl::Write(rocksdb::WriteOptions const&, rocksdb::WriteBatch*) () from /tmp/librocksdbjni4549469381011773502.so
> #14 0x00007f28e62bb0cc in Java_org_rocksdb_RocksDB_write0 () from /tmp/librocksdbjni4549469381011773502.so
> #15 0x00007f292ac9b57e in ?? ()
> #16 0x00000000b21578a8 in ?? ()
> #17 0x0000000093632f70 in ?? ()
> #18 0x0000000000000012 in ?? ()
> #19 0x00007f292a8a0b84 in ?? ()
> #20 0x01751bb038161c0c in ?? ()
> #21 0x0000000093632848 in ?? ()
> #22 0x00000000aa506db0 in ?? ()
> #23 0x00007f28c43f6290 in ?? ()
> #24 0x00000000a9bc47d8 in ?? ()
> #25 0x00007f292b170254 in ?? ()
> #26 0x00007f28c43f6280 in ?? ()
> #27 0x00007f293d988c48 in JVM_MonitorWait () from /opt/software/jdk1.8.0_231/jre/lib/amd64/server/libjvm.so
> Backtrace stopped: previous frame inner to this frame (corrupt stack?)
> {code}
> *Example 3: put key failed, but DN does not crash.*
> {code:java}
> 2020-10-13 11:02:11,160 [RatisApplyTransactionExecutor 1] INFO org.apache.hadoop.ozone.container.keyvalue.KeyValueHandler: Operation: PutBlock , Trace ID: , Message: Put Key failed , Result: IO_EXCEPTION , StorageContainerException Occurred.
> org.apache.hadoop.hdds.scm.container.common.helpers.StorageContainerException: Put Key failed
>         at org.apache.hadoop.ozone.container.keyvalue.KeyValueHandler.handlePutBlock(KeyValueHandler.java:449)
>         at org.apache.hadoop.ozone.container.keyvalue.KeyValueHandler.dispatchRequest(KeyValueHandler.java:187)
>         at org.apache.hadoop.ozone.container.keyvalue.KeyValueHandler.handle(KeyValueHandler.java:163)
>         at org.apache.hadoop.ozone.container.common.impl.HddsDispatcher.dispatchRequest(HddsDispatcher.java:309)
>         at org.apache.hadoop.ozone.container.common.impl.HddsDispatcher.lambda$dispatch$0(HddsDispatcher.java:171)
>         at org.apache.hadoop.hdds.server.OzoneProtocolMessageDispatcher.processRequest(OzoneProtocolMessageDispatcher.java:87)
>         at org.apache.hadoop.ozone.container.common.impl.HddsDispatcher.dispatch(HddsDispatcher.java:170)
>         at org.apache.hadoop.ozone.container.common.transport.server.ratis.ContainerStateMachine.dispatchCommand(ContainerStateMachine.java:400)
>         at org.apache.hadoop.ozone.container.common.transport.server.ratis.ContainerStateMachine.runCommand(ContainerStateMachine.java:410)
>         at org.apache.hadoop.ozone.container.common.transport.server.ratis.ContainerStateMachine.lambda$applyTransaction$6(ContainerStateMachine.java:754)
>         at java.util.concurrent.CompletableFuture$AsyncSupply.run(CompletableFuture.java:1590)
>         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>         at java.lang.Thread.run(Thread.java:748)
> Caused by: java.io.IOException: Unable to write the batch.
>         at org.apache.hadoop.hdds.utils.db.RDBBatchOperation.commit(RDBBatchOperation.java:48)
>         at org.apache.hadoop.hdds.utils.db.RDBStore.commitBatchOperation(RDBStore.java:279)
>         at org.apache.hadoop.ozone.container.keyvalue.impl.BlockManagerImpl.putBlock(BlockManagerImpl.java:152)
>         at org.apache.hadoop.ozone.container.keyvalue.KeyValueHandler.handlePutBlock(KeyValueHandler.java:440)
>         ... 13 more
> Caused by: org.rocksdb.RocksDBException: unknown WriteBatch tag
>         at org.rocksdb.RocksDB.write0(Native Method)
>         at org.rocksdb.RocksDB.write(RocksDB.java:1586)
>         at org.apache.hadoop.hdds.utils.db.RDBBatchOperation.commit(RDBBatchOperation.java:46)
>         ... 16 more
> 2020-10-13 11:02:11,439 [Datanode State Machine Thread - 0] WARN org.apache.hadoop.ozone.container.common.statemachine.StateContext: No available thread in pool for past 5 seconds.
> 2020-10-13 11:02:11,449 [RatisApplyTransactionExecutor 1] ERROR org.apache.hadoop.ozone.container.common.transport.server.ratis.ContainerStateMachine: gid group-39F7E68E05D7 : ApplyTransaction failed. cmd PutBlock logIndex 1651 msg : Put Key failed Container Result: IO_EXCEPTION
> 2020-10-13 11:02:11,458 [RatisApplyTransactionExecutor 1] ERROR org.apache.hadoop.ozone.container.common.transport.server.ratis.XceiverServerRatis: pipeline Action CLOSE on pipeline PipelineID=504be054-27bc-4c67-ae2d-39f7e68e05d7.Reason : Ratis Transaction failure in datanode b65b0b6c-b0bb-429f-a23d-467c72d4b85c with role LEADER .Triggering pipeline close action.
> {code}

--
This message was sent by Atlassian Jira
(v8.3.4#803005)