Jiandan Yang created HADOOP-19447:
--------------------------------------
Summary: Add Caching Mechanism to HostResolver to Avoid Redundant
Hostname Resolutions
Key: HADOOP-19447
URL: https://issues.apache.org/jira/browse/HADOOP-19447
Project: Hadoop Common
Issue Type: New Feature
Components: common, yarn
Reporter: Jiandan Yang
*Background:*
Currently, *org.apache.hadoop.security.SecurityUtil.HostResolver* in Hadoop
resolves the hostname on every call, which adds performance overhead.
*Each heartbeat between the AM and RM causes the RM to invoke*
*HostResolver#getByName once.* In large-scale clusters running numerous
applications, this results in *a high frequency of redundant hostname
resolutions.*
*Proposal:*
Introduce a caching mechanism in HostResolver to store resolved hostnames for a
configurable duration. This would:
• Reduce redundant DNS queries.
• Improve performance for frequently used hostnames.
• Allow configuration options for cache size and TTL (Time-to-Live).
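To make the proposal concrete, here is a minimal sketch of a size-bounded cache with a TTL. It is an illustration only, not the existing Hadoop code: the class name TtlHostCache, its constructor parameters, and the injected clock are all assumptions, and the real lookup is passed in as a plain function so the sketch does not depend on Hadoop classes.

```java
import java.net.InetAddress;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Function;
import java.util.function.LongSupplier;

/** Hypothetical sketch of a TTL- and size-bounded hostname cache. */
final class TtlHostCache {
  /** A resolved address plus the time at which it stops being valid. */
  private static final class CachedAddress {
    final InetAddress address;
    final long expiresAtMillis;

    CachedAddress(InetAddress address, long expiresAtMillis) {
      this.address = address;
      this.expiresAtMillis = expiresAtMillis;
    }
  }

  private final Function<String, InetAddress> resolver; // the real lookup (assumed)
  private final long ttlMillis;
  private final LongSupplier clock; // injectable so tests can control time
  private final Map<String, CachedAddress> entries;

  TtlHostCache(Function<String, InetAddress> resolver, int maxEntries,
               long ttlMillis, LongSupplier clock) {
    this.resolver = resolver;
    this.ttlMillis = ttlMillis;
    this.clock = clock;
    // An access-ordered LinkedHashMap drops the least recently used entry
    // once the configured size is exceeded.
    this.entries = new LinkedHashMap<String, CachedAddress>(16, 0.75f, true) {
      @Override
      protected boolean removeEldestEntry(Map.Entry<String, CachedAddress> eldest) {
        return size() > maxEntries;
      }
    };
  }

  /**
   * Returns the cached address if it has not expired; otherwise resolves
   * again and stores the result. For simplicity the lock is held during
   * resolution, which a production version would likely avoid.
   */
  synchronized InetAddress getByName(String host) {
    long now = clock.getAsLong();
    CachedAddress cached = entries.get(host);
    if (cached != null && now < cached.expiresAtMillis) {
      return cached.address;
    }
    InetAddress resolved = resolver.apply(host);
    entries.put(host, new CachedAddress(resolved, now + ttlMillis));
    return resolved;
  }

  synchronized int size() {
    return entries.size();
  }
}
```

With this shape, repeated heartbeats from the same host within the TTL would hit the map instead of DNS, and the size bound keeps memory use predictable on large clusters.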
*Suggested Implementation:*
1. *Leverage Existing CachedResolver*:
The NodesListManager.CachedResolver class in Hadoop already implements a
caching mechanism for hostname resolution. Instead of introducing an entirely
new solution, we propose *extracting the caching logic from
NodesListManager.CachedResolver into a separate reusable utility class*.
2. *Create a Shared Caching Utility*:
• Extract the caching logic from NodesListManager.CachedResolver.
• Implement a new class, e.g., HostnameCache, and place it in the Hadoop Common
module so that it can be used across different components.
3. *Integrate HostnameCache with HostResolver & CachedResolver*:
• Modify HostResolver to use HostnameCache for hostname lookups.
• Update NodesListManager.CachedResolver to use HostnameCache instead of its own
internal cache.
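The integration step could look roughly like the sketch below, in which a resolver consults a shared cache before delegating to the real lookup. Everything here is an assumption made for illustration: NameResolver is a stand-in for the resolver contract, CachingNameResolver is a hypothetical decorator, and a plain concurrent map takes the place of the proposed HostnameCache (so TTL and size limits are omitted).

```java
import java.net.InetAddress;
import java.net.UnknownHostException;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Stand-in for the resolver contract; the name and shape are assumptions. */
interface NameResolver {
  InetAddress getByName(String host) throws UnknownHostException;
}

/** Hypothetical decorator that checks a shared cache before delegating. */
final class CachingNameResolver implements NameResolver {
  private final NameResolver delegate;
  // Placeholder for the proposed HostnameCache; no expiry in this sketch.
  private final Map<String, InetAddress> sharedCache;

  CachingNameResolver(NameResolver delegate, Map<String, InetAddress> sharedCache) {
    this.delegate = delegate;
    this.sharedCache = sharedCache;
  }

  @Override
  public InetAddress getByName(String host) throws UnknownHostException {
    InetAddress cached = sharedCache.get(host);
    if (cached != null) {
      return cached;
    }
    // Failures propagate to the caller and are not cached,
    // so a later call retries the lookup.
    InetAddress resolved = delegate.getByName(host);
    sharedCache.put(host, resolved);
    return resolved;
  }
}

final class SharedCacheDemo {
  public static void main(String[] args) throws UnknownHostException {
    Map<String, InetAddress> shared = new ConcurrentHashMap<>();
    NameResolver dns = InetAddress::getByName; // the uncached lookup
    // Two components wrap the same lookup and share one cache instance.
    NameResolver securityResolver = new CachingNameResolver(dns, shared);
    NameResolver nodesResolver = new CachingNameResolver(dns, shared);
    securityResolver.getByName("localhost");
    nodesResolver.getByName("localhost"); // served from the shared cache
    System.out.println("cached entries: " + shared.size());
  }
}
```

Passing the cache in through the constructor is one way to let both components share a single instance rather than each keeping its own internal map.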