dlmarion opened a new issue, #4420:
URL: https://github.com/apache/accumulo/issues/4420

   The documentation for the `srv:lock` column is:
   ```
   srv:lock [] tservers/127.0.0.1:9997/zlock-0000000001\$13fe86cd27101e5 - This 
is the lock information for the tablet holding the present lock. This 
information is checked against zookeeper whenever this is updated, which 
prevents a metadata update from a tablet server that no longer holds its lock.
   ```
   
   In `main`, the `srv:lock` column is set via `TabletMutator.putZooLock` in 
the client code, and it's checked in `MetadataConstraints.check` on the server 
side. `MetadataConstraints` checks that the lock value set in the Mutation is 
actually a valid lock in ZooKeeper. The ZooKeeper lock is used for validation 
instead of the location because a TabletServer could be restarted on the same 
host and end up with the same address, but the lock would not be the same. It's 
possible that TabletServer A could send a mutation to TabletServer B to 
update the metadata table for a Tablet that TabletServer A is hosting, and then 
TabletServer A could fail right after sending the mutation.
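
To illustrate why the lock is a stronger check than the address, here is a minimal, self-contained sketch. It is not Accumulo code: the class `LockCheckSketch`, its methods, and the second lock value are made up for illustration, and a plain in-memory set stands in for ZooKeeper. It only models the idea that an update is accepted when the exact lock named in the mutation is still held.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical toy model; an in-memory set stands in for ZooKeeper.
class LockCheckSketch {
  // Lock values that are currently held (assumed format: path$sessionId).
  private final Set<String> heldLocks = new HashSet<>();

  public void acquire(String lockValue) {
    heldLocks.add(lockValue);
  }

  public void release(String lockValue) {
    heldLocks.remove(lockValue);
  }

  // Lock-based check: passes only if this exact lock is still held.
  public boolean isLockHeld(String lockValue) {
    return heldLocks.contains(lockValue);
  }

  // Address-based check, for comparison: passes if any held lock
  // belongs to a server at this address.
  public boolean isAddressLive(String address) {
    for (String lock : heldLocks) {
      String[] parts = lock.split("/");
      if (parts.length > 1 && parts[1].equals(address)) {
        return true;
      }
    }
    return false;
  }

  public static void main(String[] args) {
    LockCheckSketch zk = new LockCheckSketch();
    // First value is the example from the documentation; second is invented.
    String oldLock = "tservers/127.0.0.1:9997/zlock-0000000001$13fe86cd27101e5";
    String newLock = "tservers/127.0.0.1:9997/zlock-0000000002$13fe86cd27101e6";

    zk.acquire(oldLock); // TabletServer A starts and sends a mutation
    zk.release(oldLock); // A fails right after sending it
    zk.acquire(newLock); // a new process starts on the same host and port

    System.out.println("address check passes: " + zk.isAddressLive("127.0.0.1:9997"));
    System.out.println("lock check passes: " + zk.isLockHeld(oldLock));
  }
}
```

In this model the address check still passes after the restart, while the lock check rejects the mutation that carries the old lock.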
   
   In `main`, `TabletMutator.putZooLock` is called from:
   ```
   ManagerMetadataUtil.addNewTablet
   ManagerMetadataUtil.replaceDatafiles
   ManagerMetadataUtil.updateTabletDataFile
   MetadataTableUtil.addTablet
   MetadataTableUtil.removeScanFiles
   MetadataTableUtil.removeUnusedWALEntries
   MetadataTableUtil.updateTabletCompactID
   MetadataTableUtil.updateTabletDataFile
   MetadataTableUtil.updateTabletFlushID
   MetadataTableUtil.updateTabletVolumes
   ```
   
   In `elasticity` the `putZooLock` method moved to `TabletUpdates` and is 
called from:
   ```
   TabletGroupWatcher.replaceVolumes
   PopulateMetadata.writeSplitsToMetadataTable
   MetadataTableUtil.removeScanFiles
   MetadataTableUtil.removeUnusedWALEntries
   MetadataTableUtil.updateTabletDataFile
   Tablet.flush
   Tablet.updateTabletDataFile
   ```
   
   The differences between these lists make me question whether we missed 
something when updating the code. I'm also wondering in which cases the 
ServiceLock should *not* be checked when updating the metadata and root tables 
- should it always be checked?
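
For reference, the two caller lists above can be compared mechanically. The sketch below is only a name-level set difference over the method names listed in this issue; the class name `PutZooLockCallerDiff` is made up. A method that appears in only one list may simply have been renamed or moved, so this narrows down where to look rather than showing that a check is missing.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical helper; the names are copied from the two lists in this issue.
class PutZooLockCallerDiff {
  static final List<String> MAIN = Arrays.asList(
      "ManagerMetadataUtil.addNewTablet",
      "ManagerMetadataUtil.replaceDatafiles",
      "ManagerMetadataUtil.updateTabletDataFile",
      "MetadataTableUtil.addTablet",
      "MetadataTableUtil.removeScanFiles",
      "MetadataTableUtil.removeUnusedWALEntries",
      "MetadataTableUtil.updateTabletCompactID",
      "MetadataTableUtil.updateTabletDataFile",
      "MetadataTableUtil.updateTabletFlushID",
      "MetadataTableUtil.updateTabletVolumes");

  static final List<String> ELASTICITY = Arrays.asList(
      "TabletGroupWatcher.replaceVolumes",
      "PopulateMetadata.writeSplitsToMetadataTable",
      "MetadataTableUtil.removeScanFiles",
      "MetadataTableUtil.removeUnusedWALEntries",
      "MetadataTableUtil.updateTabletDataFile",
      "Tablet.flush",
      "Tablet.updateTabletDataFile");

  // Names present in a but not in b, sorted.
  static Set<String> minus(List<String> a, List<String> b) {
    Set<String> out = new TreeSet<>(a);
    out.removeAll(b);
    return out;
  }

  // Names present in both lists, sorted.
  static Set<String> common(List<String> a, List<String> b) {
    Set<String> out = new TreeSet<>(a);
    out.retainAll(b);
    return out;
  }

  public static void main(String[] args) {
    System.out.println("only in main: " + minus(MAIN, ELASTICITY));
    System.out.println("only in elasticity: " + minus(ELASTICITY, MAIN));
    System.out.println("in both: " + common(MAIN, ELASTICITY));
  }
}
```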
   