ruojieranyishen opened a new issue, #1562: URL: https://github.com/apache/incubator-pegasus/issues/1562
## Bug Report

Please answer these questions before submitting your issue. Thanks!

1. What did you do?

In the production environment, Pegasus performs bulk load, which involves rocksdb ingestion. Because `write_global_seqno` is true, rocksdb modifies the `rocksdb.external_sst_file.global_seqno` field in the external sstable file during ingestion. The modified value is sometimes inaccurate. Later, when rocksdb verifies that the field is valid, the check fails and causes a coredump.

2. What did you expect to see?

rocksdb should not coredump.

3. What did you see instead?

When the `rocksdb.external_sst_file.global_seqno` field is inaccurate, rocksdb will coredump. The error message is:

```text
log.1741.txt:71521:D2023-04-23 08:28:50.578 (1682209730578970297 110362) replica.compact7.040500143ab21610: pegasus_server_impl.cpp:2888:do_manual_compact(): [[email protected]:37801] finish CompactRange, status = Corruption: An external sst file with version 2 have global seqno property with value �, while largest seqno in the file is 64789, time_used = 18ms
```

The related check in the rocksdb source code is:

```c++
uint32_t version = DecodeFixed32(version_pos->second.c_str());

if (version < 2) {
  if (seqno_pos != props.end() || version != 1) {
    std::array<char, 200> msg_buf;
    // This is a v1 external sst file, global_seqno is not supported.
    snprintf(msg_buf.data(), msg_buf.max_size(),
             "An external sst file with version %u have global seqno "
             "property with value %s",
             version, seqno_pos->second.c_str());
    return Status::Corruption(msg_buf.data());
  }
  return Status::OK();
}
```

4. What version of Pegasus are you using?

This is not related to the Pegasus version.

## Solution

After some research, I verified that setting `write_global_seqno` to false solves this problem. `write_global_seqno` controls whether rocksdb modifies `rocksdb.external_sst_file.global_seqno` during the ingest process. After I set `write_global_seqno` to false, the above problem no longer occurs. Newer versions of rocksdb recommend this setting.
The conclusions are as follows:

- **Disadvantages**: Not compatible with rocksdb versions prior to 5.16.
- `write_global_seqno=false` stops rocksdb from modifying the external sst file, so `rocksdb.external_sst_file.global_seqno` always stays zero. The global seqno information is retained in the MANIFEST (the `smallest_seqno` and `largest_seqno` fields), and no additional performance overhead is introduced. Inside rocksdb, the order of external files and internal files is still determined by the global seqno.
- `write_global_seqno=false` does not conflict with the `ingest_behind` function of bulk load. Read, write, and delete operations work normally. Because the sst file is no longer modified during ingestion, ingestion is faster and the sst file can be verified through its checksum.
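
To make the effect of a zero `rocksdb.external_sst_file.global_seqno` concrete, here is a minimal, self-contained sketch of the kind of consistency check that produced the `Corruption` status above. It is a simplified model based on my reading of the version 2 branch of the rocksdb check, not the rocksdb source itself. `CheckResult`, `EncodeLE64`, `DecodeLE64`, and `CheckGlobalSeqno` are hypothetical names introduced only for illustration, and the fallback from a zero property to `largest_seqno` is an assumption about rocksdb's behavior.

```cpp
#include <cstddef>
#include <cstdint>
#include <map>
#include <string>

// Hypothetical stand-in for rocksdb::Status, reduced to what the sketch needs.
struct CheckResult {
  bool ok;
  std::string message;
};

// Property name taken from the issue text.
const char kGlobalSeqnoProp[] = "rocksdb.external_sst_file.global_seqno";

// Illustrative helpers: store a 64-bit value as 8 little-endian bytes.
std::string EncodeLE64(uint64_t v) {
  std::string out(8, '\0');
  for (int i = 0; i < 8; ++i) {
    out[i] = static_cast<char>((v >> (8 * i)) & 0xff);
  }
  return out;
}

uint64_t DecodeLE64(const std::string& s) {
  uint64_t v = 0;
  for (std::size_t i = 0; i < 8 && i < s.size(); ++i) {
    v |= static_cast<uint64_t>(static_cast<unsigned char>(s[i])) << (8 * i);
  }
  return v;
}

// Simplified model of the check (assumption, not rocksdb code):
// - a zero or missing property means the seqno was not written into the
//   file, so the value recorded in the MANIFEST (largest_seqno) is used;
// - a non-zero property must match largest_seqno, otherwise the file is
//   reported as corrupted.
CheckResult CheckGlobalSeqno(const std::map<std::string, std::string>& props,
                             uint64_t largest_seqno) {
  uint64_t global_seqno = 0;
  auto it = props.find(kGlobalSeqnoProp);
  if (it != props.end()) {
    global_seqno = DecodeLE64(it->second);
  }
  if (global_seqno == 0) {
    global_seqno = largest_seqno;
  }
  if (global_seqno != largest_seqno) {
    return {false, "Corruption: global seqno property is " +
                       std::to_string(global_seqno) +
                       ", while largest seqno in the file is " +
                       std::to_string(largest_seqno)};
  }
  return {true, "OK"};
}
```

Under this model, a property of zero (the `write_global_seqno=false` case) passes for any `largest_seqno`, while an inaccurate non-zero value fails with a message similar to the one in the log. In rocksdb itself the setting lives on `IngestExternalFileOptions` as the `write_global_seqno` member.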
