[ 
https://issues.apache.org/jira/browse/HDFS-14277?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16922747#comment-16922747
 ] 

Konstantin Shvachko commented on HDFS-14277:
--------------------------------------------

I think for 2.10 we should mostly target Daren's comment that there should be 
no regression when Observers are not used.
It is important to fix performance issues with Observers, but that should not 
block the release, since with more NameNodes the overall performance will 
improve anyway.
So I propose to target HDFS-14822 as the 2.10 release blocker, and drop the 
blocker label here.

> [SBN read] Observer benchmark results
> -------------------------------------
>
>                 Key: HDFS-14277
>                 URL: https://issues.apache.org/jira/browse/HDFS-14277
>             Project: Hadoop HDFS
>          Issue Type: Task
>          Components: ha, namenode
>    Affects Versions: 2.10.0, 3.3.0
>         Environment: Hardware: 4-node cluster, each node has 4 cores, Xeon 
> 2.5GHz, 25GB memory.
> Software: CentOS 7.4, CDH 6.0 + Consistent Reads from Standby, Kerberos, SSL, 
> RPC encryption + Data Transfer Encryption, Cloudera Navigator.
>            Reporter: Wei-Chiu Chuang
>            Priority: Blocker
>              Labels: release-blocker
>         Attachments: Observer profiler.png, Screen Shot 2019-02-14 at 
> 11.50.37 AM.png, observer RPC queue processing time.png
>
>
> Ran a few benchmarks and a profiler (VisualVM) today on an Observer-enabled 
> cluster. I would like to share the results with the community. The cluster 
> has 1 Observer node.
> h2. NNThroughputBenchmark
> Generate 1 million files and send fileStatus RPCs.
> {code:java}
> hadoop org.apache.hadoop.hdfs.server.namenode.NNThroughputBenchmark -fs 
> <namenode>  -op fileStatus -threads 100 -files 1000000 -useExisting 
> -keepResults
> {code}
> h3. Kerberos, SSL, RPC encryption, Data Transfer Encryption enabled:
> ||Node||fileStatus (Ops per sec)||
> |Active NameNode|4865|
> |Observer|3996|
> h3. Kerberos, SSL:
> ||Node||fileStatus (Ops per sec)||
> |Active NameNode|7078|
> |Observer|6459|
> Observations:
>  * Due to the edit tailing overhead, the Observer node consumes 30% CPU 
> even when the cluster is idle.
>  * While the Active NN has less than 1ms RPC processing time, the Observer 
> node has more than 5ms RPC processing time. I am still looking for the 
> source of the longer processing time. The longer RPC processing time may be 
> the cause of the performance degradation compared to that of the Active NN. 
> Note that the cluster has Cloudera Navigator installed, which adds overhead 
> to RPC processing time.
>  * {{GlobalStateIdContext#isCoordinatedCall()}} pops up as one of the top 
> hotspots in the profiler. 
>  
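
For reference, the relative gap between the Active NameNode and the Observer in the two tables above can be worked out with a few lines of Python. This is only a sketch: the throughput numbers are copied from the issue description, while the names `RESULTS`, `observer_slowdown_pct` and `summarize` are hypothetical helpers introduced here for illustration and are not part of Hadoop or NNThroughputBenchmark.

```python
# fileStatus throughput (ops per sec), copied from the tables in the issue description.
RESULTS = {
    "Kerberos, SSL, RPC encryption, Data Transfer Encryption": {
        "active": 4865,
        "observer": 3996,
    },
    "Kerberos, SSL": {
        "active": 7078,
        "observer": 6459,
    },
}


def observer_slowdown_pct(active_ops, observer_ops):
    """Percentage by which Observer throughput falls below Active throughput."""
    return (active_ops - observer_ops) / active_ops * 100.0


def summarize(results):
    """Map each security configuration to the Observer slowdown, rounded to 0.1%."""
    return {
        config: round(observer_slowdown_pct(r["active"], r["observer"]), 1)
        for config, r in results.items()
    }


if __name__ == "__main__":
    for config, pct in summarize(RESULTS).items():
        print(f"{config}: Observer is {pct}% below Active")
```

With the reported numbers this gives roughly 17.9% lower throughput on the Observer with all encryption enabled, and roughly 8.7% lower with Kerberos and SSL only.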



