saxenapranav commented on code in PR #5450: URL: https://github.com/apache/hadoop/pull/5450#discussion_r1124593024
########## hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/LowRedundancyBlocks.java: ########## @@ -369,7 +369,7 @@ synchronized boolean remove(BlockInfo block, * @return true if the block was found and removed from one of the priority * queues */ - boolean remove(BlockInfo block, int priLevel) { + synchronized boolean remove(BlockInfo block, int priLevel) { Review Comment: > > This will just ensure only one thread getting into the remove method. But what if there is one thread trying to remove, other trying to add or size(). > > add() and size() are already synchronized in the current code. > > 1. https://github.com/apache/hadoop/blob/01027e52a9789eb5b386729f52b7bb9e52fa5352/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/LowRedundancyBlocks.java#L290 > 2. https://github.com/apache/hadoop/blob/01027e52a9789eb5b386729f52b7bb9e52fa5352/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/LowRedundancyBlocks.java#L125 these method being synchronized means that only one thread can enter into the synchronized method, but doesn't make the object its working on synchronized. There could be one thread calling size which leads to `priorityQueues.get(i).size()`, and other calling add which leads to `priorityQueues.get(priLevel).add(blockInfo)` simultaneously. What probable issue is: for adding element https://github.com/apache/hadoop/blob/ccdb978603696a91c4828a66aad27f585543b376/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/util/LightWeightLinkedSet.java#L87-L125 is the code. let say thread goes till https://github.com/apache/hadoop/blob/ccdb978603696a91c4828a66aad27f585543b376/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/util/LightWeightLinkedSet.java#L101 and then context switch happens and the thread doesn't get CPU in future for some time. Now size has been incremented but addition has not happend. the thread calling size() gets the CPU, it will get size which is more than what exactly is in the set. Thats why feel that priorityQueues is still not thread-safe and hence suggesting the change. Feel free to disagree. Thanks. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org --------------------------------------------------------------------- To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org