[ https://issues.apache.org/jira/browse/HDFS-5064?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13732990#comment-13732990 ]

Aaron T. Myers commented on HDFS-5064:
--------------------------------------

bq. I think most of the writers on the SBN are datanodes. If this is true, separating 
FSN and BlockManager locking will help. Last time I checked, we wanted a 
facility to enforce a lock hierarchy before attempting to do this.

Yes, this would be great. I took a look into this and concluded that it would 
be a pretty complex change as the code currently stands. I also think it won't 
necessarily completely address the issue: several threads in the NN periodically 
wake up and try to take the FSNS write lock, and if that happens while the FSNS 
read lock is held for a long time, we'll be right back in this situation.

bq. Or we could resort to an SBN-specific solution, since it probably only 
needs to block EditLogTailer and perhaps prevent concurrent checkpointing.

This is what I was thinking. I chatted about this offline with Todd Lipcon, and 
we came up with the following idea:

* Add new methods {{FSNS#readLock(boolean longRead)}} and 
{{FSNS#readUnlock(boolean longRead)}}. When these are called, if the argument 
longRead is true, an AtomicInteger will be incremented or decremented to 
indicate that the read lock may be held by this thread for a long time.
* We change {{FSNS#writeLock()}} to check if the aforementioned counter is 
greater than zero, which would indicate that there's a long reader currently 
holding the lock. In this case, instead of immediately trying to get the write 
lock, the thread will {{wait(...)}} in a loop until the counter returns to zero.
* We will change the handful of places in the SBN where we know the read lock 
may be held for a long time to indicate that these are long reads. The edit log 
tailer and the checkpointing thread are the two places that come to mind, as 
Kihwal mentioned.
* All other operations will still take the (short) read lock as normal.

This should allow short reads to proceed concurrently with the reads which we 
have now identified in the code as potentially taking a long time.
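
To make the idea a bit more concrete, here's a rough sketch of the locking scheme. This is not actual {{FSNamesystem}} code; the class and field names below (e.g. {{LongReadAwareLock}}, {{numLongReaders}}) are just placeholders for illustration:

{code:java}
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Placeholder class, just to illustrate the proposed readLock(boolean) /
// writeLock() interaction -- not the real FSNamesystem locking code.
class LongReadAwareLock {
  // Fair RW lock, as the FSNS lock is today.
  private final ReentrantReadWriteLock coarseLock = new ReentrantReadWriteLock(true);
  // Number of threads currently holding the read lock for a potentially long read.
  private final AtomicInteger numLongReaders = new AtomicInteger(0);

  public void readLock(boolean longRead) {
    if (longRead) {
      numLongReaders.incrementAndGet();
    }
    coarseLock.readLock().lock();
  }

  public void readUnlock(boolean longRead) {
    coarseLock.readLock().unlock();
    if (longRead && numLongReaders.decrementAndGet() == 0) {
      synchronized (this) {
        notifyAll(); // wake up any writer waiting for the long readers to drain
      }
    }
  }

  public void writeLock() throws InterruptedException {
    // Don't queue up behind a long reader: with a fair lock, a queued writer
    // blocks every short reader that arrives after it. Instead, wait until
    // the long readers have drained, then take the write lock as usual.
    synchronized (this) {
      while (numLongReaders.get() > 0) {
        wait(1000);
      }
    }
    coarseLock.writeLock().lock();
  }

  public void writeUnlock() {
    coarseLock.writeLock().unlock();
  }
}
{code}

The edit log tailer and the checkpointer would call {{readLock(true)}}/{{readUnlock(true)}}, and everything else would keep calling the existing (short) variants.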

What do you think of this proposal, Kihwal?
                
> Standby checkpoints should not block concurrent readers
> -------------------------------------------------------
>
>                 Key: HDFS-5064
>                 URL: https://issues.apache.org/jira/browse/HDFS-5064
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: ha, namenode
>    Affects Versions: 2.1.1-beta
>            Reporter: Aaron T. Myers
>            Assignee: Aaron T. Myers
>
> We've observed an issue which causes fetches of the {{/jmx}} page of the NN 
> to take a long time to load when the standby is in the process of creating a 
> checkpoint.
> Even though both creating the checkpoint and gathering the statistics for 
> {{/jmx}} take only the FSNS read lock, the issue is that since the FSNS uses 
> a _fair_ RW lock, a single writer attempting to get the lock will block all 
> threads attempting to get only the read lock for the duration of the 
> checkpoint. This will cause {{/jmx}}, and really any thread only attempting 
> to get the read lock, to block for the duration of the checkpoint, even 
> though they should be able to proceed concurrently with the checkpointing 
> thread.
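
For reference, here's a minimal standalone demonstration of the fair-lock behavior described above. This is plain {{java.util.concurrent}}, not HDFS code:

{code:java}
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Demonstrates that with a *fair* RW lock, a writer waiting in the queue
// blocks readers that arrive after it, even while an earlier reader still
// holds the read lock.
public class FairRwLockDemo {
  public static void main(String[] args) throws Exception {
    final ReentrantReadWriteLock lock = new ReentrantReadWriteLock(true); // fair

    lock.readLock().lock(); // long-running reader, e.g. the checkpointer

    Thread writer = new Thread(new Runnable() {
      public void run() {
        lock.writeLock().lock();   // queues up behind the reader
        lock.writeLock().unlock();
      }
    });
    writer.start();
    Thread.sleep(100); // give the writer time to queue up

    // A second reader now has to queue behind the writer rather than reading
    // concurrently; the fairness-honoring timed tryLock reports this.
    System.out.println("second reader could proceed: "
        + lock.readLock().tryLock(0, TimeUnit.MILLISECONDS)); // expected: false

    lock.readLock().unlock();
    writer.join();
  }
}
{code}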

