[ https://issues.apache.org/jira/browse/KUDU-3325?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
yejiabao_h updated KUDU-3325:
-----------------------------
    Attachment: image-2021-10-06-19-23-51-769.png

> When wal is deleted, fault recovery and load balancing are abnormal
> -------------------------------------------------------------------
>
>                 Key: KUDU-3325
>                 URL: https://issues.apache.org/jira/browse/KUDU-3325
>             Project: Kudu
>          Issue Type: Bug
>          Components: consensus
>            Reporter: yejiabao_h
>            Priority: Major
>         Attachments: image-2021-10-06-15-36-40-996.png,
> image-2021-10-06-15-36-53-813.png, image-2021-10-06-15-37-09-520.png,
> image-2021-10-06-15-37-24-776.png, image-2021-10-06-15-37-42-533.png,
> image-2021-10-06-15-37-54-782.png, image-2021-10-06-15-38-06-575.png,
> image-2021-10-06-15-38-17-388.png, image-2021-10-06-15-38-29-176.png,
> image-2021-10-06-15-38-39-852.png, image-2021-10-06-15-38-53-343.png,
> image-2021-10-06-15-39-03-296.png, image-2021-10-06-19-23-51-769.png
>
>
> h3. 1. Use kudu leader step down to generate multiple WAL entries
> ./kudu tablet leader_step_down $MASTER_IP 1299f5a939d2453c83104a6db0cae3e7
> h4. wal
> !image-2021-10-06-15-36-40-996.png!
> h4. cmeta
> !image-2021-10-06-15-36-53-813.png!
> h3. 2. Stop one tserver to trigger tablet recovery, so that opid_index
> is flushed to cmeta
> !image-2021-10-06-15-37-09-520.png!
> h4. wal
> !image-2021-10-06-15-37-24-776.png!
> h4. cmeta
> !image-2021-10-06-15-37-42-533.png!
> h3. 3. Stop all tservers and delete the tablet WAL
> !image-2021-10-06-15-37-54-782.png!
> h3. 4. Start all tservers
> The index in the WAL restarts counting from 1, but the opid_index
> recorded in cmeta is still 20, the value from before the WAL was deleted.
>
> h4. wal
> !image-2021-10-06-15-38-06-575.png!
>
> h4. cmeta
> !image-2021-10-06-15-38-17-388.png!
>
> h3. 5. Stop a tserver to trigger fault recovery
> !image-2021-10-06-15-38-29-176.png!
> When the leader recovers a replica and the master requests a Raft config
> change to add the new replica, the leader replica ignores the change
> because its op index is smaller than the opid_index stored in cmeta.
>
> h3. 6. Delete all WALs
> !image-2021-10-06-15-38-39-852.png!
> h3. 7. Run kudu cluster rebalance
> ./kudu cluster rebalance $MASTER_IP
> !image-2021-10-06-15-38-53-343.png!
> !image-2021-10-06-15-39-03-296.png!
> Rebalancing also fails at the Raft config change step.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
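
The failure described in steps 4 and 5 can be illustrated with a small model. This is only a sketch of the reported symptom, not Kudu's actual code: the names `RaftConfig`, `ReplicaState`, and `commit_config_change` are hypothetical, and the exact comparison Kudu performs is an assumption. It shows how a config change that gets a low index from a recreated WAL could be treated as stale when cmeta still holds opid_index 20.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class RaftConfig:
    opid_index: int      # index of the op that committed this config (kept in cmeta)
    peers: List[str]


@dataclass
class ReplicaState:
    committed_config: RaftConfig  # hypothetical stand-in for what cmeta persists
    last_wal_index: int           # hypothetical stand-in for the last index in the WAL


def commit_config_change(state: ReplicaState, new_peers: List[str]) -> bool:
    """Append a config change to the WAL and try to commit it.

    Assumption: a change is applied only if its op index is greater than
    the opid_index of the committed config; otherwise it is ignored.
    """
    op_index = state.last_wal_index + 1
    state.last_wal_index = op_index
    if op_index <= state.committed_config.opid_index:
        return False  # treated as stale, config stays unchanged
    state.committed_config = RaftConfig(op_index, list(new_peers))
    return True


# Healthy replica: WAL and cmeta agree (both at index 20).
healthy = ReplicaState(RaftConfig(20, ["A", "B", "C"]), last_wal_index=20)
print(commit_config_change(healthy, ["A", "B", "D"]))  # True

# After the WAL was deleted: WAL restarts from 1, cmeta still says 20.
after_wal_delete = ReplicaState(RaftConfig(20, ["A", "B", "C"]), last_wal_index=1)
print(commit_config_change(after_wal_delete, ["A", "B", "D"]))  # False
```

Under this model, every config change is dropped until the new WAL index passes 20, which would match both the failed recovery in step 5 and the failed rebalance in step 7.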