huangluyu opened a new issue, #10518:
URL: https://github.com/apache/rocketmq/issues/10518

   ### Is Your Feature Request Related to a Problem?
   
   When ~/.rocketmq_offsets/ files are corrupted (e.g., truncated due to power 
loss during persistAll()), readLocalOffsetBak() unconditionally throws 
MQClientException, which prevents the consumer from starting and crashes the 
entire application. The error "maybe fastjson version too low" is misleading — 
the real cause is file corruption, not a version issue.
   
   For applications that already rely on CONSUME_FROM_TIMESTAMP, the persistent 
offset is non-critical — losing it does not affect data correctness. But there 
is currently no way to tell RocketMQ "just ignore corrupted offset files and 
start fresh."
   
   ### Describe the Solution You'd Like
   
   Add a client configuration property, e.g.:
   
   ```
   # Default false: keep existing behavior (throw on corruption)
   rocketmq.client.localOffsetStore.ignoreCorrupted=false
   ```
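
   One way the client could read this flag is sketched below, assuming it is exposed as a JVM system property. The property name comes from this proposal and does not exist in RocketMQ today; the class and method names are hypothetical.

   ```java
   // Hypothetical helper: reads the proposed opt-in flag from a JVM system property.
   final class OffsetStoreFlags {
       // Proposed key from this issue; not an existing RocketMQ option.
       static final String IGNORE_CORRUPTED_KEY =
           "rocketmq.client.localOffsetStore.ignoreCorrupted";

       private OffsetStoreFlags() {
       }

       // Returns false unless the property is explicitly set to "true" (case-insensitive),
       // so existing deployments keep the current fail-fast behavior.
       static boolean ignoreCorrupted() {
           return Boolean.parseBoolean(System.getProperty(IGNORE_CORRUPTED_KEY, "false"));
       }
   }
   ```

   With this shape, an application would opt in by passing -Drocketmq.client.localOffsetStore.ignoreCorrupted=true on the JVM command line.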
   
   When set to true, readLocalOffsetBak() should return null instead of 
throwing when JSON parsing fails:
   
   ```
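   private OffsetSerializeWrapper readLocalOffsetBak() throws MQClientException {
       // ...
       if (content != null && content.length() > 0) {
           try {
               offsetSerializeWrapper =
                   OffsetSerializeWrapper.fromJson(content, ...);
           } catch (Exception e) {
               log.warn("readLocalOffset Exception", e);
               // Proposed opt-in: treat an unparseable file as "no stored offsets".
               if (ignoreCorrupted) {
                   return null;
               }
               // Default: keep the existing fail-fast behavior.
               throw new MQClientException("readLocalOffset Exception, ...", e);
           }
           // Parsing succeeded: hand the stored offsets back to the caller.
           return offsetSerializeWrapper;
       }
       // No content to read.
       return null;
   }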
   ```
   
   When ignoreCorrupted=true and offset files are corrupted, load() starts with 
an empty offset table, and the consumer falls back to CONSUME_FROM_TIMESTAMP 
(or the latest message if no timestamp is set). Behavior differs from the 
default only on startup after file corruption; normal operation is unaffected.
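
   The fallback described above can be sketched in isolation. This is a minimal model under stated assumptions, not RocketMQ code: TolerantOffsetLoader and its toy "queue=offset" parser are hypothetical stand-ins for LocalFileOffsetStore and OffsetSerializeWrapper.fromJson, and IllegalStateException stands in for MQClientException.

   ```java
   import java.util.HashMap;
   import java.util.Map;
   import java.util.concurrent.ConcurrentHashMap;
   import java.util.function.Function;

   // Hypothetical stand-in for the local offset store; not RocketMQ's implementation.
   final class TolerantOffsetLoader {
       private final boolean ignoreCorrupted;
       private final Function<String, Map<String, Long>> parser;

       TolerantOffsetLoader(boolean ignoreCorrupted, Function<String, Map<String, Long>> parser) {
           this.ignoreCorrupted = ignoreCorrupted;
           this.parser = parser;
       }

       // Returns parsed offsets, or null when there is nothing usable to read.
       Map<String, Long> read(String content) {
           if (content == null || content.isEmpty()) {
               return null;
           }
           try {
               return parser.apply(content);
           } catch (RuntimeException e) {
               if (ignoreCorrupted) {
                   return null; // opt-in: treat corruption as "no stored offsets"
               }
               throw new IllegalStateException("local offset file is corrupted", e);
           }
       }

       // Builds the in-memory offset table; it stays empty when read() returns null.
       Map<String, Long> load(String content) {
           Map<String, Long> table = new ConcurrentHashMap<>();
           Map<String, Long> parsed = read(content);
           if (parsed != null) {
               table.putAll(parsed);
           }
           return table;
       }

       // Toy parser for "queue=offset" lines; throws on malformed or truncated input.
       static Map<String, Long> parseToy(String content) {
           Map<String, Long> result = new HashMap<>();
           for (String line : content.split("\n")) {
               String[] kv = line.split("=", 2);
               if (kv.length != 2) {
                   throw new IllegalArgumentException("malformed entry: " + line);
               }
               result.put(kv[0], Long.parseLong(kv[1]));
           }
           return result;
       }
   }
   ```

   In this model a truncated input such as "q0=10\nq1=" yields an empty table when the flag is on and an exception when it is off, while well-formed input loads identically in both modes.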
   
   
   ### Describe Alternatives You've Considered
   
   - Always return null (no config): Too aggressive. It changes the default 
behavior for all existing users, who may depend on the exception for fail-fast 
semantics.
   - Application-level workaround: Manually delete ~/.rocketmq_offsets/ before 
each consumer start. Effective but fragile, because every affected application 
has to implement its own cleanup logic. A built-in opt-in config is cleaner 
and more discoverable.
   - Point the system property -Drocketmq.client.localOffsetStoreDir at a 
tmpfs: Redirects offsets to volatile storage, avoiding persistence entirely. 
Works, but throws away offsets for all consumers globally, which may be 
undesirable in mixed-use cases.
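
   For reference, the application-level workaround amounts to something like the sketch below. The directory resolution assumes the system property and default path named in this issue; OffsetDirCleaner and its methods are hypothetical names.

   ```java
   import java.io.File;
   import java.io.IOException;
   import java.io.UncheckedIOException;
   import java.nio.file.Files;
   import java.nio.file.Path;
   import java.nio.file.Paths;
   import java.util.Comparator;
   import java.util.List;
   import java.util.stream.Collectors;
   import java.util.stream.Stream;

   // Hypothetical cleanup helper an application would call before starting its consumer.
   final class OffsetDirCleaner {
       private OffsetDirCleaner() {
       }

       // Assumed lookup: the system property if set, otherwise ~/.rocketmq_offsets.
       static Path resolveOffsetDir() {
           String fallback = System.getProperty("user.home") + File.separator + ".rocketmq_offsets";
           return Paths.get(System.getProperty("rocketmq.client.localOffsetStoreDir", fallback));
       }

       // Deletes dir and everything under it; returns the number of entries removed.
       static int deleteRecursively(Path dir) {
           if (!Files.exists(dir)) {
               return 0;
           }
           try {
               List<Path> ordered;
               try (Stream<Path> paths = Files.walk(dir)) {
                   // Reverse order puts children before their parent directories.
                   ordered = paths.sorted(Comparator.reverseOrder()).collect(Collectors.toList());
               }
               for (Path p : ordered) {
                   Files.delete(p);
               }
               return ordered.size();
           } catch (IOException e) {
               throw new UncheckedIOException(e);
           }
       }
   }
   ```

   Calling OffsetDirCleaner.deleteRecursively(OffsetDirCleaner.resolveOffsetDir()) before the consumer starts discards every stored offset under that directory, including valid ones, which is part of why this workaround is fragile.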
   
   ### Additional Context
   
   _No response_

