Hi dev,

I'd like to draw your attention to an issue with the current read consistency level in the RatisConsensus module. The default level is "query statemachine directly", which is latency-friendly but has led to user-reported bugs. Specifically, consecutive SQL queries issued during a restart can return inconsistent results, a phantom read problem that is confusing for our users.
To address this issue, I propose that we temporarily raise the read consistency level to linearizable read during restarts. This keeps reads consistent during the critical recovery period. Once the cluster has finished recovering from its previous logs, we revert to the default consistency level. You can find more details about the proposed solution in the linked pull request: https://github.com/apache/iotdb/pull/10597

**Please note** that this change may affect any module (including CQ, schema region, and data region) that calls RatisConsensus.read during the restart process. In such cases, a RatisUnderRecoveryException may be returned, indicating that RatisConsensus cannot serve read requests while it is replaying the RaftLog. We therefore strongly encourage the affected modules to handle this situation appropriately, for example by implementing a retry mechanism.

I look forward to hearing your thoughts on this proposal. Your feedback and suggestions will be appreciated.

Regards,
William Song
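
P.S. For modules that want a starting point, below is a minimal sketch of the kind of retry mechanism mentioned above. It is only an illustration under stated assumptions: the class RecoveryRetry, the method readWithRetry, and the nested UnderRecoveryException are hypothetical names made up for this example. The nested exception merely stands in for RatisUnderRecoveryException so that the snippet compiles on its own, and the Callable stands in for whatever call a module makes to RatisConsensus.read. The attempt count and backoff values are arbitrary.

```java
import java.util.concurrent.Callable;

public class RecoveryRetry {

    /** Hypothetical stand-in for RatisUnderRecoveryException, declared locally so the sketch is self-contained. */
    public static class UnderRecoveryException extends Exception {
        public UnderRecoveryException(String message) {
            super(message);
        }
    }

    /**
     * Runs the given read and retries with exponential backoff while the
     * consensus layer reports that it is still recovering.
     * Any other exception is propagated immediately.
     */
    public static <T> T readWithRetry(Callable<T> read, int maxAttempts, long initialBackoffMs)
            throws Exception {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be at least 1");
        }
        long backoffMs = initialBackoffMs;
        for (int attempt = 1; ; attempt++) {
            try {
                return read.call();
            } catch (UnderRecoveryException e) {
                if (attempt >= maxAttempts) {
                    throw e; // give up and let the caller decide what to do
                }
                Thread.sleep(backoffMs);
                backoffMs = Math.min(backoffMs * 2, 5_000L); // cap the wait at 5 seconds
            }
        }
    }

    public static void main(String[] args) throws Exception {
        int[] calls = {0};
        // Simulated read: fails twice as if the log were still being replayed, then succeeds.
        String result = readWithRetry(() -> {
            if (++calls[0] < 3) {
                throw new UnderRecoveryException("replaying RaftLog");
            }
            return "ok";
        }, 5, 10L);
        System.out.println(result + " after " + calls[0] + " attempts");
    }
}
```

Whether a bounded retry like this is appropriate, or whether a module should instead surface the error to the client, probably depends on the caller. A user-facing query may prefer to fail fast, while a background task such as CQ can likely afford to wait.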
