[ 
https://issues.apache.org/jira/browse/HDFS-16939?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17696193#comment-17696193
 ] 

ASF GitHub Bot commented on HDFS-16939:
---------------------------------------

saxenapranav commented on code in PR #5450:
URL: https://github.com/apache/hadoop/pull/5450#discussion_r1124593024


##########
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/LowRedundancyBlocks.java:
##########
@@ -369,7 +369,7 @@ synchronized boolean remove(BlockInfo block,
    * @return true if the block was found and removed from one of the priority
    *         queues
    */
-  boolean remove(BlockInfo block, int priLevel) {
+  synchronized boolean remove(BlockInfo block, int priLevel) {

Review Comment:
   > > This will just ensure only one thread getting into the remove method. 
But what if there is one thread trying to remove, other trying to add or size().
   > 
   > add() and size() are already synchronized in the current code.
   > 
   > 1. 
https://github.com/apache/hadoop/blob/01027e52a9789eb5b386729f52b7bb9e52fa5352/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/LowRedundancyBlocks.java#L290
   > 2. 
https://github.com/apache/hadoop/blob/01027e52a9789eb5b386729f52b7bb9e52fa5352/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/LowRedundancyBlocks.java#L125
   
   these method being synchronized means that only one thread can enter into 
the synchronized method, but doesn't make the object its working on 
synchronized.  There could be one thread calling size which leads to 
`priorityQueues.get(i).size()`, and other calling add which leads to 
`priorityQueues.get(priLevel).add(blockInfo)` simultaneously.
   Example of probable issue is:
   for adding element 
https://github.com/apache/hadoop/blob/ccdb978603696a91c4828a66aad27f585543b376/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/util/LightWeightLinkedSet.java#L87-L125
 is the code. let say thread goes till 
https://github.com/apache/hadoop/blob/ccdb978603696a91c4828a66aad27f585543b376/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/util/LightWeightLinkedSet.java#L101
 and then context switch happens and the thread doesn't get CPU in future for 
some time. Now size has been incremented but addition has not happend. the 
thread calling size() gets the CPU, it will get size which is more than what 
exactly is in the set. Thats why feel that priorityQueues is still not 
thread-safe and hence suggesting the change.
   Please feel free to disagree.
   
   Thanks.





> Fix the thread safety bug in LowRedundancyBlocks
> ------------------------------------------------
>
>                 Key: HDFS-16939
>                 URL: https://issues.apache.org/jira/browse/HDFS-16939
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: namanode
>            Reporter: Shuyan Zhang
>            Assignee: Shuyan Zhang
>            Priority: Major
>              Labels: pull-request-available
>
> The remove method in LowRedundancyBlocks is not protected by synchronized. 
> This method is private and is called by BlockManager. As a result, 
> priorityQueues has the risk of being accessed concurrently by multiple 
> threads.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

---------------------------------------------------------------------
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org

Reply via email to