[ 
https://issues.apache.org/jira/browse/HDDS-8267?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17704600#comment-17704600
 ] 

Wei-Chiu Chuang commented on HDDS-8267:
---------------------------------------

FWIW, the log file was not missing; RocksDB deleted it deliberately. 
RocksDB itself should be able to tell whether one of its WAL files has been 
deleted. IMO it's a bug in RocksDB.
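
To illustrate the failure mode, here is a minimal sketch of how a caller could 
recognize a "WAL file no longer exists" error from the exception message shown 
in the OM log quoted below. It assumes the message format seen in that log; 
the class WalFileCheck and its methods are hypothetical and are not part of 
Ozone or RocksDB.

```java
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper for illustration only; not part of Ozone or RocksDB. */
public final class WalFileCheck {
  // Assumed message shape, taken from the OM log in this issue:
  // "while stat a file for size: <path>.log: No such file or directory"
  private static final Pattern MISSING_WAL = Pattern.compile(
      "while stat a file for size:\\s+(\\S+\\.log): No such file or directory");

  private WalFileCheck() { }

  /** Returns the WAL path named in the message, or null if the message does not match. */
  public static String missingWalPath(String message) {
    if (message == null) {
      return null;
    }
    Matcher m = MISSING_WAL.matcher(message);
    return m.find() ? m.group(1) : null;
  }

  /** True if the message names a WAL file that is absent on disk right now. */
  public static boolean isDeletedWal(String message) {
    String path = missingWalPath(message);
    return path != null && !Files.exists(Paths.get(path));
  }
}
```

A caller could pass RocksDBException.getMessage() to isDeletedWal and treat a 
true result as "the requested sequence number is older than the retained WAL" 
rather than as a generic I/O error. This only classifies the error; it does 
not address the crash itself.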

> getOMDBUpdates requests crashed Ozone Manager
> ---------------------------------------------
>
>                 Key: HDDS-8267
>                 URL: https://issues.apache.org/jira/browse/HDDS-8267
>             Project: Apache Ozone
>          Issue Type: Bug
>    Affects Versions: 1.4.0
>            Reporter: Wei-Chiu Chuang
>            Priority: Critical
>         Attachments: HDDS-8267-ozone-om.log, LOG, LOG.old.1679288337380113, 
> LOG.old.1679301915549029, LOG.old.1679315208126016, LOG.old.1679350165641592, 
> hs_err_pid195294.log, stderr.log
>
>
> An OM crashed. Its log has:
> {noformat}
> 2023-03-20 23:29:11,751 ERROR org.apache.hadoop.hdds.utils.db.RDBStore: Unable to get delta updates since sequenceNumber 98273. This exception will not be thrown to the client
> java.io.IOException: RocksDatabase[/var/lib/hadoop-ozone/om/data/om.db]: Failed to getUpdatesSince 98273; status : IOError(Undefined); message : while stat a file for size: /var/lib/hadoop-ozone/om/data/om.db/000121.log: No such file or directory
>         at org.apache.hadoop.hdds.utils.HddsServerUtil.toIOException(HddsServerUtil.java:576)
>         at org.apache.hadoop.hdds.utils.db.RocksDatabase.toIOException(RocksDatabase.java:85)
>         at org.apache.hadoop.hdds.utils.db.RocksDatabase.getUpdatesSince(RocksDatabase.java:724)
>         at org.apache.hadoop.hdds.utils.db.RDBStore.getUpdatesSince(RDBStore.java:368)
>         at org.apache.hadoop.ozone.om.OzoneManager.getDBUpdates(OzoneManager.java:3975)
>         at org.apache.hadoop.ozone.protocolPB.OzoneManagerRequestHandler.getOMDBUpdates(OzoneManagerRequestHandler.java:354)
>         at org.apache.hadoop.ozone.protocolPB.OzoneManagerRequestHandler.handleReadRequest(OzoneManagerRequestHandler.java:233)
>         at org.apache.hadoop.ozone.protocolPB.OzoneManagerProtocolServerSideTranslatorPB.submitReadRequestToOM(OzoneManagerProtocolServerSideTranslatorPB.java:223)
>         at org.apache.hadoop.ozone.protocolPB.OzoneManagerProtocolServerSideTranslatorPB.processRequest(OzoneManagerProtocolServerSideTranslatorPB.java:177)
>         at org.apache.hadoop.hdds.server.OzoneProtocolMessageDispatcher.processRequest(OzoneProtocolMessageDispatcher.java:87)
>         at org.apache.hadoop.ozone.protocolPB.OzoneManagerProtocolServerSideTranslatorPB.submitRequest(OzoneManagerProtocolServerSideTranslatorPB.java:147)
>         at org.apache.hadoop.ozone.protocol.proto.OzoneManagerProtocolProtos$OzoneManagerService$2.callBlockingMethod(OzoneManagerProtocolProtos.java)
>         at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:533)
>         at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1070)
>         at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:989)
>         at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:917)
>         at java.security.AccessController.doPrivileged(Native Method)
>         at javax.security.auth.Subject.doAs(Subject.java:422)
>         at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1898)
>         at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2894)
> Caused by: org.rocksdb.RocksDBException: while stat a file for size: /var/lib/hadoop-ozone/om/data/om.db/000121.log: No such file or directory
>         at org.rocksdb.RocksDB.getUpdatesSince(Native Method)
>         at org.rocksdb.RocksDB.getUpdatesSince(RocksDB.java:3966)
>         at org.apache.hadoop.hdds.utils.db.RocksDatabase.getUpdatesSince(RocksDatabase.java:721)
>         ... 17 more
> {noformat}
> There is definitely no 000121.log in that directory:
> {noformat}
> ...
> -rw-r--r-- 1 hdfs hdfs      1219 Mar 20 22:07 000091.sst
> -rw-r--r-- 1 hdfs hdfs     36002 Mar 20 22:07 000092.sst
> -rw-r--r-- 1 hdfs hdfs     86659 Mar 20 22:07 000093.sst
> -rw-r--r-- 1 hdfs hdfs 137260115 Mar 20 23:28 000166.log
> -rw-r--r-- 1 hdfs hdfs     23833 Mar 20 23:28 000167.sst
> -rw-r--r-- 1 hdfs hdfs      1221 Mar 20 23:28 000168.sst
> -rw-r--r-- 1 hdfs hdfs     11350 Mar 20 23:28 000169.sst
> -rw-r--r-- 1 hdfs hdfs      1888 Mar 20 23:28 000170.sst
> -rw-r--r-- 1 hdfs hdfs      6001 Mar 20 23:28 000171.sst
> -rw-r--r-- 1 hdfs hdfs     10118 Mar 20 23:28 000172.sst
> -rw-r--r-- 1 hdfs hdfs      1330 Mar 20 23:28 000173.sst
> -rw-r--r-- 1 hdfs hdfs    303259 Mar 20 23:28 000175.sst
> -rw-r--r-- 1 hdfs hdfs 295146251 Mar 20 23:29 000176.log
> -rw-r--r-- 1 hdfs hdfs     90863 Mar 20 23:28 000177.sst
> {noformat}
> The stderr repeatedly shows the following RocksDB error:
> {noformat}
> Exception in thread "Thread-835" java.lang.IllegalArgumentException: Illegal value provided for FlushReason: 13
>       at org.rocksdb.FlushReason.fromValue(FlushReason.java:51)
>       at org.rocksdb.FlushJobInfo.<init>(FlushJobInfo.java:41)
> Exception in thread "Thread-837" java.lang.IllegalArgumentException: Illegal value provided for FlushReason: 13
>       at org.rocksdb.FlushReason.fromValue(FlushReason.java:51)
>       at org.rocksdb.FlushJobInfo.<init>(FlushJobInfo.java:41)
> Exception in thread "Thread-836" java.lang.IllegalArgumentException: Illegal value provided for FlushReason: 13
>       at org.rocksdb.FlushReason.fromValue(FlushReason.java:51)
>       at org.rocksdb.FlushJobInfo.<init>(FlushJobInfo.java:41)
> Exception in thread "Thread-842" java.lang.IllegalArgumentException: Illegal value provided for FlushReason: 13
>       at org.rocksdb.FlushReason.fromValue(FlushReason.java:51)
>       at org.rocksdb.FlushJobInfo.<init>(FlushJobInfo.java:41)
> {noformat}
> There is a crash report file hs_err_pid195294.log:
> {noformat}
> Java frames: (J=compiled Java code, j=interpreted, Vv=VM code)
> j  org.rocksdb.RocksDB.getLatestSequenceNumber(J)J+0
> j  org.rocksdb.RocksDB.getLatestSequenceNumber()J+5
> j  org.apache.hadoop.hdds.utils.db.RocksDatabase.getLatestSequenceNumber()J+18
> j  org.apache.hadoop.hdds.utils.db.RDBStore.getUpdatesSince(JJ)Lorg/apache/hadoop/hdds/utils/db/DBUpdatesWrapper;+395
> j  org.apache.hadoop.ozone.om.OzoneManager.getDBUpdates(Lorg/apache/hadoop/ozone/protocol/proto/OzoneManagerProtocolProtos$DBUpdatesRequest;)Lorg/apache/hadoop/ozone/om/helpers/DBUpdates;+30
> j  org.apache.hadoop.ozone.protocolPB.OzoneManagerRequestHandler.getOMDBUpdates(Lorg/apache/hadoop/ozone/protocol/proto/OzoneManagerProtocolProtos$DBUpdatesRequest;)Lorg/apache/hadoop/ozone/protocol/proto/OzoneManagerProtocolProtos$DBUpdatesResponse;+9
> j  org.apache.hadoop.ozone.protocolPB.OzoneManagerRequestHandler.handleReadRequest(Lorg/apache/hadoop/ozone/protocol/proto/OzoneManagerProtocolProtos$OMRequest;)Lorg/apache/hadoop/ozone/protocol/proto/OzoneManagerProtocolProtos$OMResponse;+533
> J 19482 C1 org.apache.hadoop.ozone.protocolPB.OzoneManagerProtocolServerSideTranslatorPB.submitReadRequestToOM(Lorg/apache/hadoop/ozone/protocol/proto/OzoneManagerProtocolProtos$OMRequest;)Lorg/apache/hadoop/ozone/protocol/proto/OzoneManagerProtocolProtos$OMResponse; (45 bytes) @ 0x00007f37b24d48e4 [0x00007f37b24d44a0+0x444]
> J 17472 C1 org.apache.hadoop.ozone.protocolPB.OzoneManagerProtocolServerSideTranslatorPB.processRequest(Lorg/apache/hadoop/ozone/protocol/proto/OzoneManagerProtocolProtos$OMRequest;)Lorg/apache/hadoop/ozone/protocol/proto/OzoneManagerProtocolProtos$OMResponse; (214 bytes) @ 0x00007f37b3789ac4 [0x00007f37b3788ea0+0xc24]
> J 17794 C1 org.apache.hadoop.ozone.protocolPB.OzoneManagerProtocolServerSideTranslatorPB$$Lambda$468.apply(Ljava/lang/Object;)Ljava/lang/Object; (12 bytes) @ 0x00007f37b163c124 [0x00007f37b163bfc0+0x164]
> {noformat}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)