Anatoly created HDFS-17421:
------------------------------
Summary: Check the correctness of the calculation
RpcAuthentication*
Key: HDFS-17421
URL: https://issues.apache.org/jira/browse/HDFS-17421
Project: Hadoop HDFS
Issue Type: Test
Components: hdfs, metrics
Reporter: Anatoly
I wanted to estimate the load on the KDC in a Hadoop cluster after enabling
Kerberos.
HDFS exposes two relevant metrics:
{code:java}
RpcAuthenticationSuccesses - Total number of authentication successes
RpcAuthenticationFailures - Total number of authentication failures
{code}
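For reference, here is a minimal sketch of how these two counters could be read from a NameNode's JMX JSON output. It assumes the counters are published on beans whose name contains RpcActivityForPort, and the base URL passed to fetch_jmx (for example http://namenode.example.com:9870) is a placeholder, not a value from my cluster. On a Kerberized cluster the HTTP endpoint may also require SPNEGO, which this sketch does not handle.

```python
import json
from urllib.request import urlopen

# The two counters discussed in this issue.
COUNTERS = ("RpcAuthenticationSuccesses", "RpcAuthenticationFailures")


def extract_rpc_auth_counters(jmx_payload):
    """Return {bean name: {counter: value}} for every RPC activity bean.

    Assumption: the RPC counters live on beans whose name contains
    "RpcActivityForPort" (one bean per RPC server port).
    """
    result = {}
    for bean in jmx_payload.get("beans", []):
        name = bean.get("name", "")
        if "RpcActivityForPort" not in name:
            continue
        result[name] = {c: bean[c] for c in COUNTERS if c in bean}
    return result


def fetch_jmx(base_url):
    """Download and parse the /jmx JSON document of a Hadoop daemon.

    base_url is a placeholder such as "http://namenode.example.com:9870".
    No authentication is performed here.
    """
    with urlopen(base_url + "/jmx") as response:
        return json.load(response)
```

Calling extract_rpc_auth_counters(fetch_jmx(base_url)) before and after a test would give the per-port values, which matters because a NameNode can run more than one RPC server.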
I expected every request to the cluster to trigger a request to the KDC to get a
ticket, and therefore to add +1 to one of these metrics:
RpcAuthenticationSuccesses on success, RpcAuthenticationFailures on failure.
However, on my test cluster with 4 DataNodes and 2 NameNodes (HA), the metrics
behave quite differently.
I noticed that RpcAuthenticationSuccesses increases steadily, by about +1 every
30 seconds.
State of the cluster before the tests:
# Only the HDFS-\{NN,DN,JN,ZKFC} and YARN-\{RM,NM} services are running
# All other components that had been enabled (Hive, Spark HistoryServer) are
disabled
# There are no YARN jobs running and no user requests to HDFS
Metric values before the tests:
RpcAuthenticationFailures = 0
RpcAuthenticationSuccesses = 208322
*TEST 1*
To check the load, I ran spark-submit with spark-examples_2.12-3.5.0.jar
and num-executors 1.
The job ran for 1 min 20 sec. Afterwards:
RpcAuthenticationSuccesses = 208338
In total, the counter grew by 16 during the run.
+2 can be attributed to the background +1 every 30 seconds. But what do the
other +14 mean?
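The arithmetic behind that question can be sketched as follows. The function name and the 30-second background period are my own illustrative choices, based on the +1 every 30 seconds observed above.

```python
def unexplained_increase(before, after, elapsed_seconds, background_period_seconds=30):
    """Split a counter increase into background ticks and the remainder.

    Assumption: the background activity adds exactly +1 per full period.
    Returns (total increase, background share, unexplained remainder).
    """
    total = after - before
    background = elapsed_seconds // background_period_seconds
    return total, background, total - background


# TEST 1: 208322 -> 208338 over 1 min 20 sec (80 seconds)
total, background, unexplained = unexplained_increase(208322, 208338, 80)
print(total, background, unexplained)  # 16 2 14
```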
*TEST 2*
Before:
RpcAuthenticationFailures = 0
RpcAuthenticationSuccesses = 208388
Command:
hdfs dfs -ls /
After:
RpcAuthenticationFailures = 0
RpcAuthenticationSuccesses = 208389
*TEST 3*
Turned off:
all DNs
the standby NN
all YARN services
Still running:
three JNs, ZKFC
one active NN
RpcAuthenticationSuccesses still increases by +1 every 30 seconds.
Either I misunderstand what these metrics mean, or they are being counted
incorrectly.