[
https://issues.apache.org/jira/browse/HADOOP-19847?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18066928#comment-18066928
]
ASF GitHub Bot commented on HADOOP-19847:
-----------------------------------------
CapMoon opened a new pull request, #8359:
URL: https://github.com/apache/hadoop/pull/8359
HADOOP-19847. Move logAllocatedBlock out of lock in
FSNamesystem.getAdditionalBlock to reduce latency
### Description of PR
FSNamesystem.getAdditionalBlock currently calls logAllocatedBlock while
holding the global lock. Flame graph analysis shows that this logging path
(via SLF4J/Log4j appenders) adds non-trivial latency inside the critical
section, which blocks other NameNode operations.
Since logAllocatedBlock only performs audit/diagnostic logging and does not
modify shared state, the call can safely be moved to after the global lock is
released, which reduces lock hold time and improves write throughput.
This change preserves all existing logging behavior while eliminating
unnecessary lock contention from I/O-bound logging operations.
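The pattern described above can be sketched roughly as follows. This is a minimal, self-contained illustration and not the actual FSNamesystem code: the class `BlockAllocatorSketch`, its fields, and the log message format are hypothetical stand-ins, and a plain `ReentrantReadWriteLock` is assumed in place of the namesystem lock.

```java
import java.util.concurrent.locks.ReentrantReadWriteLock;

/** Hypothetical stand-in for the namesystem; NOT the real FSNamesystem. */
final class BlockAllocatorSketch {
  // Assumption: a plain read/write lock models the global namesystem lock.
  private final ReentrantReadWriteLock fsLock = new ReentrantReadWriteLock();
  private long nextBlockId = 1;
  // Records whether the write lock was held during the most recent log call.
  private boolean lockHeldWhileLogging;

  /** Before: the log call runs inside the critical section. */
  long allocateLoggingInsideLock(String src) {
    fsLock.writeLock().lock();
    try {
      long blockId = nextBlockId++;
      logAllocatedBlock(src, blockId); // appender I/O extends lock hold time
      return blockId;
    } finally {
      fsLock.writeLock().unlock();
    }
  }

  /** After: capture the values under the lock, log once it is released. */
  long allocateLoggingOutsideLock(String src) {
    long blockId;
    fsLock.writeLock().lock();
    try {
      blockId = nextBlockId++; // only the state change stays under the lock
    } finally {
      fsLock.writeLock().unlock();
    }
    logAllocatedBlock(src, blockId); // uses local copies, no shared state
    return blockId;
  }

  boolean wasLockHeldWhileLogging() {
    return lockHeldWhileLogging;
  }

  // Illustrative message format only; the real log line may differ.
  private void logAllocatedBlock(String src, long blockId) {
    lockHeldWhileLogging = fsLock.isWriteLockedByCurrentThread();
    System.out.println("allocated block " + blockId + " for " + src);
  }

  public static void main(String[] args) {
    BlockAllocatorSketch sketch = new BlockAllocatorSketch();
    sketch.allocateLoggingInsideLock("/tmp/a");
    System.out.println("lock held: " + sketch.wasLockHeldWhileLogging());
    sketch.allocateLoggingOutsideLock("/tmp/b");
    System.out.println("lock held: " + sketch.wasLockHeldWhileLogging());
  }
}
```

One consequence of this design is that the log call must read only values captured while the lock was held. Log lines from concurrent callers may also interleave in a different order than the lock acquisitions, which is usually acceptable for diagnostic output.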
### How was this patch tested?
N/A
<img width="3454" height="1206" alt="logAllocatedBlock takes lots of time"
src="https://github.com/user-attachments/assets/4173630c-6c81-4dbe-9de1-82e3dfadcd46"
/>
> Move logAllocatedBlock out of lock in FSNamesystem.getAdditionalBlock to
> reduce latency
> ---------------------------------------------------------------------------------------
>
> Key: HADOOP-19847
> URL: https://issues.apache.org/jira/browse/HADOOP-19847
> Project: Hadoop Common
> Issue Type: Improvement
> Components: hdfs
> Affects Versions: 3.4.3
> Reporter: yue.wang
> Priority: Major
> Labels: HDFS
> Attachments: logAllocatedBlock takes lots of time.png
>
>
> {{FSNamesystem.getAdditionalBlock}} currently calls {{logAllocatedBlock}}
> while holding the global lock. Flame graph analysis shows that this logging
> path (via SLF4J/Log4j appenders) adds non-trivial latency inside the critical
> section, which blocks other NameNode operations.
>
> Since {{logAllocatedBlock}} only performs audit/diagnostic logging and does
> not modify shared state, the call can safely be moved to after the global
> lock is released, which reduces lock hold time and improves write throughput.
>
> This change preserves all existing logging behavior while eliminating
> unnecessary lock contention from I/O-bound logging operations.
>
> Flame graph:
> !logAllocatedBlock takes lots of time.png|width=1083,height=378!