[jira] [Commented] (HDFS-10817) Add Logging for Long-held NN Read Locks

Zhe Zhang (JIRA) Wed, 31 Aug 2016 10:37:36 -0700

    [ 
https://issues.apache.org/jira/browse/HDFS-10817?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15452844#comment-15452844
 ]


Zhe Zhang commented on HDFS-10817:
----------------------------------

Thanks Erik for the updates.

I'm not very familiar with {{ThreadLocal}} either. Thanks for the pointer and I 
agree it's a good idea to {{remove()}}. This way when the thread is recycled by 
the thread pool, the value will be reset.

One corner case is when {{readUnlock}} is called before any {{readLock}}. Will 
we see an NPE? Since we are not overriding {{ThreadLocal#initialValue}}.

bq. 5. This actually isn't possible under the current FSNamesystemLock design. 
Within FSNamesystemLock, the methods to actually lock are not exposed. It 
simply has methods writeLock and readLock which return the respective lock, 
then the calling class uses the lock/lockIinterruptibly methods on the returned 
objects. 
Good thoughts. I was initially thinking about moving the time-tracking logic 
into {{FSNamesystemLock}}. E.g. we can move {{writeLockHeldTimeStamp}} and 
{{readLockHeldTimeStamp}} into {{FSNamesystemLock}}. The lock class can take 
care of the time-tracking logic. This way {{lock}} and {{lockIinterruptibly}} 
can share the logic and the time-tracking logic can be unit tested more easily. 
We can also consider consolidating the threshold and {{needReport}} logic there 
too. Maybe also add richer APIs to the {{FSNamesystemLock}} class like 
reporting the overall lock held time of a thread (instead of a single reentrant 
period). But I think we should discuss all these as a separate follow-on 
anyway. Also, since we are printing the stack trace, the methods which actually 
acquire the lock are still logged right?

The test code looks very good.

So +1 after we clarify the {{readUnlock}} NPE concern.

> Add Logging for Long-held NN Read Locks
> ---------------------------------------
>
>                 Key: HDFS-10817
>                 URL: https://issues.apache.org/jira/browse/HDFS-10817
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: logging, namenode
>            Reporter: Erik Krogen
>            Assignee: Erik Krogen
>         Attachments: HDFS-10817.000.patch, HDFS-10817.001.patch, 
> HDFS-10817.002.patch, HDFS-10817.003.patch
>
>
> Right now the Namenode will log when a write lock is held for a long time to 
> help tracks methods which are causing expensive delays. Let's do the same for 
> read locks since these operations may also be expensive/long and cause 
> delays. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org

[jira] [Commented] (HDFS-10817) Add Logging for Long-held NN Read Locks

Reply via email to