[ https://issues.apache.org/jira/browse/IOTDB-606?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17091091#comment-17091091 ]
Houliang Qi commented on IOTDB-606:
-----------------------------------

Two operations can change the contents of the partition table:
1. Adding a node;
2. Removing a node.

A node needs to pull a MetaSnapshot mainly in the following cases:
1. A new node is added.
2. A node restarts after downtime and its meta information lags far behind the leader's.
3. A node that was cut off by a network partition rejoins the cluster and its meta information lags far behind the leader's.

For 1, no request will arrive before the new partition table is applied, so the node can just apply the partition table directly.

For 2 and 3, if a request is routed to this node while its partition table is still old, the metadata obtained by the metamember or datamember is also wrong. In this case the operation will definitely fail, so let the upper layer retry. The node can directly replace the partition table. Until the replacement of the partition table is complete, all operations are blocked (emptying flow).

The above considers adding or removing only one node at a time. Now consider adding or removing multiple nodes. Since all operations are performed sequentially at the leader node, the leader has the newest partition table, and Raft guarantees that the partition table the leader gives to a follower is accurate. So for a follower, this case is the same as adding or removing a single node.

Please leave your opinion, thanks.

> [Distributed] Replace raw logs in MetaSnapshot
> ----------------------------------------------
>
>                 Key: IOTDB-606
>                 URL: https://issues.apache.org/jira/browse/IOTDB-606
>             Project: Apache IoTDB
>          Issue Type: Improvement
>            Reporter: Tian Jiang
>            Priority: Major
>              Labels: cluster, metadata, snapshot
>
> The current MetaSnapshot uses the simplest approach: storing the raw committed logs.
> It would be more efficient to replace the logs with compact structures
> like the partition table and other objects that will be affected by meta logs.

--
This message was sent by Atlassian Jira
(v8.3.4#803005)
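
The handling of cases 2 and 3 in the comment (fail requests that hit a stale node, let the upper layer retry, and block all operations while the partition table is swapped) can be sketched roughly as below. This is only an illustration in Python, not IoTDB code: the `Node` class, the `StalePartitionTable` exception, the integer version number, and the slot-to-owner dictionary are all hypothetical names and layouts chosen for the example. A single lock stands in for "emptying flow"; it also serializes ordinary operations, which a real implementation would probably avoid.

```python
import threading


class StalePartitionTable(Exception):
    """Hypothetical error: the node's partition table does not match the
    version the request was routed with, so the caller should retry."""


class Node:
    def __init__(self, table, version):
        # One lock guards both the table and its version. Replacing the
        # table takes the lock, so every operation waits until it is done.
        self._lock = threading.Lock()
        self._table = dict(table)  # slot -> owning node (assumed layout)
        self._version = version

    def install_snapshot(self, table, version):
        """Replace the whole partition table with the leader's copy."""
        with self._lock:
            # Ignore snapshots that are not newer than what we already hold.
            if version > self._version:
                self._table = dict(table)
                self._version = version

    def execute(self, slot, routed_version):
        """Look up the owner of a slot, failing if this node is out of date."""
        with self._lock:
            if routed_version != self._version:
                raise StalePartitionTable(self._version)
            return self._table[slot]
```

With this sketch, a request routed with version 2 to a node still at version 1 raises `StalePartitionTable`; once `install_snapshot` has applied the version 2 table, the retried request succeeds.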