[ https://issues.apache.org/jira/browse/HDFS-1112?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12867332#action_12867332 ]

Hairong Kuang commented on HDFS-1112:
-------------------------------------

It turns out that the edit output stream for the backup NN does not serialize 
edit log entries as they are written, so there is no way to implement 
getUnflushedDataLen(). Here is a minor change to the proposal: instead, add a 
method boolean isTimeToSync() to EditLogOutputStream. isTimeToSync() lets 
each edit stream implement its own automatic sync policy, either by size or 
by time. By default, no automatic sync is supported; the file-based stream 
(EditLogFileOutputStream) returns true when the buffered data length is 
greater than its initial buffer size.
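
For illustration, here is a minimal sketch of what the proposed API could 
look like. Only EditLogOutputStream and isTimeToSync() come from the proposal 
above; the fields and the size-based policy shown are assumptions, and the 
subclass is a simplified stand-in for the real EditLogFileOutputStream.

    import java.io.ByteArrayOutputStream;

    abstract class EditLogOutputStream {
        // Default policy: no automatic sync is supported.
        boolean isTimeToSync() {
            return false;
        }
    }

    // Simplified stand-in for the file-based stream, with a size-based
    // policy: request a sync once the buffered data outgrows the buffer's
    // initial size. Field names here are hypothetical.
    class EditLogFileOutputStream extends EditLogOutputStream {
        private final int initialSize;               // initial buffer size
        private final ByteArrayOutputStream buffer;  // buffered, unsynced edits

        EditLogFileOutputStream(int initialSize) {
            this.initialSize = initialSize;
            this.buffer = new ByteArrayOutputStream(initialSize);
        }

        void write(byte[] edit) {
            buffer.write(edit, 0, edit.length);
        }

        @Override
        boolean isTimeToSync() {
            return buffer.size() > initialSize;
        }
    }

After logging each edit, the caller would check isTimeToSync() and force a 
sync when it returns true, so streams that never override the method keep 
the current behavior.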

> Edit log buffer should not grow unboundedly
> -------------------------------------------
>
>                 Key: HDFS-1112
>                 URL: https://issues.apache.org/jira/browse/HDFS-1112
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: name-node
>    Affects Versions: 0.22.0
>            Reporter: Hairong Kuang
>            Assignee: Hairong Kuang
>             Fix For: 0.22.0
>
>
> Currently HDFS does not impose an upper limit on the edit log buffer. When a 
> large number of open operations come in with access time updates on, since 
> open does not call sync automatically, the buffer may grow very large, 
> causing a memory leak and, in extreme cases, full GCs, as described in 
> HDFS-1104.
> The edit log buffer should be automatically flushed when it becomes full.
