Hi
 
I have a question about two HDFS metrics that came up while I was trying to estimate the load on the KDC for a production cluster.
 
The HDFS metrics include two counters:
RpcAuthenticationSuccesses - Total number of successful authentication attempts
RpcAuthenticationFailures - Total number of authentication failures
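
For reference, here is a minimal sketch of how these two counters can be pulled out of the NameNode's /jmx JSON output. It assumes the RPC metrics live in a bean named like Hadoop:service=NameNode,name=RpcActivityForPort8020 (the port suffix is an assumption and depends on the configured RPC port). The function name extract_auth_counters is only illustrative, and the payload below is a hand-written sample, not real cluster output.

```python
import json

# Assumed bean-name prefix; the port suffix depends on the NameNode RPC port.
RPC_BEAN_PREFIX = "Hadoop:service=NameNode,name=RpcActivityForPort"


def extract_auth_counters(jmx_payload):
    """Return {bean name: (successes, failures)} for every RPC activity bean."""
    counters = {}
    for bean in jmx_payload.get("beans", []):
        name = bean.get("name", "")
        if name.startswith(RPC_BEAN_PREFIX):
            counters[name] = (
                bean.get("RpcAuthenticationSuccesses"),
                bean.get("RpcAuthenticationFailures"),
            )
    return counters


# Hand-written sample shaped like /jmx output, using the TEST 1 starting values.
sample = json.loads("""
{"beans": [
  {"name": "Hadoop:service=NameNode,name=RpcActivityForPort8020",
   "RpcAuthenticationSuccesses": 208322,
   "RpcAuthenticationFailures": 0},
  {"name": "Hadoop:service=NameNode,name=JvmMetrics"}
]}
""")

print(extract_auth_counters(sample))
```

On a Kerberized cluster the HTTP endpoint itself may require SPNEGO, so fetching the JSON is left out of the sketch.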
 
I expected that any data request in the Hadoop cluster would
1. send a request to the KDC to get a ticket,
2. send a request to the NameNode,
after which the corresponding counter would be incremented: +1 to RpcAuthenticationSuccesses if authentication succeeds, or +1 to RpcAuthenticationFailures if it fails.
 
However, in a test cluster with 4 DataNodes and 2 NameNodes (HA), I see values for these metrics that I cannot explain.
 
I also noticed that RpcAuthenticationSuccesses steadily increases by 1 every 30 seconds.
 
TEST 1
I made sure that
1. Only the HDFS (NN, DN, JN, ZKFC) and YARN (RM, NM) services are running
2. All other installed components (Hive, Spark HistoryServer) are stopped
3. There are no YARN jobs running and no user requests to HDFS
 
At the start of the test:
RpcAuthenticationFailures = 0
RpcAuthenticationSuccesses = 208322
 
To generate some load, I ran a spark-submit test with spark-examples_2.12-3.5.0.jar and the number of executors = 1.
The job completed in 1 minute and 20 seconds, after which
RpcAuthenticationSuccesses = 208338
 
In total, the counter grew by 16 during the run.
Let's say +2 of that can be attributed to the +1 every 30 seconds that I mentioned above. But what do the remaining 14 authentications correspond to?
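
The arithmetic behind this breakdown, as a small sketch. It assumes the background increment is exactly one per 30 seconds and counts only whole periods within the 80-second run; the function name split_delta is just illustrative.

```python
def split_delta(before, after, elapsed_seconds, background_period_seconds=30):
    """Split a counter delta into an assumed background part and the remainder."""
    total = after - before
    # Whole background periods that fit into the run (assumption: +1 per period).
    background = elapsed_seconds // background_period_seconds
    return total, background, total - background


# TEST 1 values: 208322 before, 208338 after, run time 1 min 20 s = 80 s.
total, background, unexplained = split_delta(208322, 208338, 80)
print(total, background, unexplained)  # prints: 16 2 14
```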
 
TEST 2
RpcAuthenticationFailures = 0
RpcAuthenticationSuccesses = 208388
 
Then I ran
hdfs dfs -ls /
and got
RpcAuthenticationFailures = 0
RpcAuthenticationSuccesses = 208389
The counter increased by 1. Why?
I ran kinit long before the ls request, so I expected the metric not to change, but maybe I am wrong.
 
TEST 3
 
Stopped:
- All DNs
- Standby NN
- All YARN services (RM, NM)

Still running:
- Three JNs and ZKFC
- One active NN
 
RpcAuthenticationSuccesses still increases by 1 every 30 seconds.
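
One way to confirm the 30-second period is to sample the counter at regular intervals and measure the time between increases. A minimal sketch follows, assuming the samples are collected elsewhere as (timestamp in seconds, counter value) pairs. The function name increment_intervals is illustrative, and the sample data is synthetic, made up only to show the calculation.

```python
def increment_intervals(samples):
    """Return the seconds elapsed between consecutive observed counter increases.

    samples: list of (timestamp_seconds, counter_value) pairs in time order.
    """
    intervals = []
    last_change = None
    previous_value = None
    for ts, value in samples:
        if previous_value is not None and value > previous_value:
            if last_change is not None:
                intervals.append(ts - last_change)
            last_change = ts
        previous_value = value
    return intervals


# Synthetic samples (not real measurements): the counter rises at t=20, 50, 80.
samples = [
    (0, 208388), (10, 208388), (20, 208389), (30, 208389),
    (40, 208389), (50, 208390), (60, 208390), (80, 208391),
]
print(increment_intervals(samples))  # prints: [30, 30]
```

The resolution of the result is limited by the sampling interval, so the sampling should be noticeably more frequent than the period being measured.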
 
Either I misunderstand what these metrics mean, or they are being counted incorrectly.
Could you explain how these metrics are calculated? If this is not a bug in the counting and I simply misunderstand them, what is the correct way to interpret them?
 
Thank you very much
 
--
With best wishes,
Anatoliy
 
 
