[ https://issues.apache.org/jira/browse/HDFS-7609?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14290731#comment-14290731 ]

Chris Nauroth commented on HDFS-7609:
-------------------------------------

[~kihwal] and [~mingma], thank you for the additional details.  It looks like 
in your case, you noticed the slowdown in the standby NN tailing the edits.  I 
had focused on profiling NN process startup as described in the original 
problem report.  I'll take a look at the standby too.

{{PriorityQueue#remove}} is O(n), so that definitely could be problematic.  
It's odd, though, that there would be enough collisions for this to become 
noticeable.  Are any of you running a significant number of legacy 
applications linked against the RPC code from before the retry cache support 
was introduced?  If so, then perhaps a huge number of calls are not supplying 
a call ID, the NN is falling back to the default call ID value from protobuf 
decoding, and that is causing a lot of collisions.
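
To make that cost concrete, here's a small standalone sketch against a plain 
{{java.util.PriorityQueue}} (illustration only, not the NN cache code): each 
per-entry removal from a large queue pays a linear scan, so repeated removals 
add up fast.

{code:java}
import java.util.PriorityQueue;

// Standalone illustration, not NameNode code: PriorityQueue#remove(Object)
// does a linear scan of the backing array, so each removal from a large
// queue costs O(n), and repeated removals add up quickly.
public class PqRemoveCost {
  public static void main(String[] args) {
    final int size = 500_000;
    PriorityQueue<Long> queue = new PriorityQueue<>();
    for (long i = 0; i < size; i++) {
      queue.add(i);
    }

    long start = System.nanoTime();
    // Remove 1,000 elements that sit far from the head of the internal
    // array; each call scans most of the queue before finding its target.
    for (long i = size - 1; i >= size - 1_000; i--) {
      queue.remove(i);
    }
    long millis = (System.nanoTime() - start) / 1_000_000;
    System.out.println("1000 removals from a " + size
        + "-element PriorityQueue took " + millis + " ms");
  }
}
{code}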

bq. If PriorityQueue.remove() took much time, can we utilize 
PriorityQueue.removeAll(Collection) so that multiple CacheEntry's are removed 
in one round ?

Unfortunately, I don't think our usage pattern is amenable to that change.  We 
apply transactions one by one.  Switching to {{removeAll}} implies a pretty big 
code restructuring to batch up retry cache entries before the calls into the 
retry cache.  Encountering a huge number of collisions is unexpected, so I'd 
prefer to investigate that.
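
For reference, here's a rough sketch of the two shapes being compared.  The 
names below are hypothetical stand-ins rather than the actual 
{{FSEditLogLoader}}/{{RetryCache}} code; the point is only that {{removeAll}} 
needs entries batched up across transactions, which the one-by-one replay 
loop doesn't do today.

{code:java}
import java.util.HashSet;
import java.util.List;
import java.util.PriorityQueue;
import java.util.Set;

// Hypothetical sketch of the two call patterns; not the real HDFS APIs.
public class RemovalPatterns {

  // Today's shape: edits are applied one at a time, so any internal
  // PriorityQueue#remove runs once per transaction (each call is O(n)).
  static void applyOneByOne(List<Long> txIds, PriorityQueue<Long> expiryQueue) {
    for (long txId : txIds) {
      // ... apply the edit, touch the retry cache ...
      expiryQueue.remove(txId);
    }
  }

  // Batched shape implied by removeAll: entries would have to be collected
  // across many transactions first, which means restructuring the replay
  // loop rather than making a local change.
  static void applyBatched(List<Long> txIds, PriorityQueue<Long> expiryQueue) {
    Set<Long> toRemove = new HashSet<>();
    for (long txId : txIds) {
      // ... apply the edit ...
      toRemove.add(txId);
    }
    // One pass over the queue instead of one linear scan per entry.
    expiryQueue.removeAll(toRemove);
  }
}
{code}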

> startup used too much time to load edits
> ----------------------------------------
>
>                 Key: HDFS-7609
>                 URL: https://issues.apache.org/jira/browse/HDFS-7609
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: namenode
>    Affects Versions: 2.2.0
>            Reporter: Carrey Zhan
>         Attachments: HDFS-7609-CreateEditsLogWithRPCIDs.patch, 
> recovery_do_not_use_retrycache.patch
>
>
> One day my namenode crashed because two journal nodes timed out at the same 
> time under very high load, leaving behind about 100 million transactions in 
> the edits log. (I still have no idea why they were not rolled into the fsimage.)
> I tried to restart the namenode, but it showed that almost 20 hours would be 
> needed to finish, and it was loading fsedits most of the time. I also tried 
> to restart the namenode in recovery mode, but the loading speed was no different.
> I looked into the stack trace and judged that it was caused by the retry 
> cache. So I set dfs.namenode.enable.retrycache to false, and the restart 
> process finished in half an hour.
> I think the retry cache is useless during startup, at least during the 
> recovery process.


