ChenSammi commented on PR #10109: URL: https://github.com/apache/ozone/pull/10109#issuecomment-4421721539
> > Generally we should choose the one with less complexity, and less cost. If replacing the fields inline, there are at least 4 fields' swaps, one by one. If replacing the container reference, there is one swap. Basically, hot swap one pointer has less complex result than hot swap four pointers, right? > > I don't think updating fields in an object, which can be done atomically under an existing lock is complex. > > Swapping the object for another is simple, but look at this bug it causes and the fix which is bleeding out into lots of code areas. And even after this fix, if someone adds a new service, they have to know they need to do these strange things (get a container, then get it again and check it hasn't changed) to avoid a race condition they shouldn't really be concerned about. > > If we fix this at source, by locking the container object, updating the fields atomically and unlocking, then its fixed once and for all in a single place @sodonnel , currently we swap one container pointer under container's read lock protection, we got one stale container reference. If we swap four fields under the same container's read lock protection, we got four stale fields reference. We need to double check all code places where these four stale fields reference can happen to know how many new race conditions and what's their consequences, they might be less severe or more severe than this jira. For example, the chunk read doesn't hold any container lock, once the thread has the container object, it will assume volume, chunk, db path will be not changed. If multiple fields are swapped, it could end up with some path point to new location, some path point to old location, which seems more dangerous than the issues described by this jira, which are mainly there could be some blocks left in the replica, or some replicas are left not deleted in the first place(I believe this replica will be eventually deleted, a new replica for a deleted container will be deleted eventually). And even if we implement the swap four fields solution, it's also not once and for all, every new code added, we also need to verify if there is no stale reference case. The root cause of issue described in this JIRA, is because we do the container pointer swap in containerSet under a read lock protection after container is moved to destination volume, if it's under a write lock protection, then some blocks left in the replica, or some replicas are left not deleted in the first place will not happen. But consider moving a container from one volume might take a substantial time, we cannot afford to write lock the container, as it will make other threads waiting for operation on this container wait for a considerable amount of time. The current approach in this jira, is not a perfect approach, the perfect approach is use write lock, but it's a approach with relative low complexity, as double check after get lock is common in concurrent environment, easy to understand, and it's easy to find out where the write lock is held, and do the double check after the lock acquired, which is easier than identify all the places where volume, db, chunk path, meta path are used, and handle the stale reference. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected] --------------------------------------------------------------------- To unsubscribe, e-mail: [email protected] For additional commands, e-mail: [email protected]
