[
https://issues.apache.org/jira/browse/HDFS-17772?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17944904#comment-17944904
]
ASF GitHub Bot commented on HDFS-17772:
---------------------------------------
gyz-web opened a new pull request, #7617:
URL: https://github.com/apache/hadoop/pull/7617
JIRA: [HDFS-17772](https://issues.apache.org/jira/browse/HDFS-17772).
### Description of PR
When we use RBF SBN READ in our production environment, we found the
following issue.
HDFS-16550 introduced the parameter `dfs.journalnode.edit-cache-size.fraction`
to size the cache as a fraction of the JournalNode heap memory, but it has an
int overflow issue:
When the `dfs.journalnode.edit-cache-size.fraction` parameter is used to
control the cache capacity, the initialization of `capacity` in
`org.apache.hadoop.hdfs.qjournal.server.JournaledEditsCache#JournaledEditsCache`
**suffers a long-to-int overflow**. For instance, when the heap memory is
configured as 32GB (where `Runtime.getRuntime().maxMemory()` returns
30,542,397,440 bytes), the overflow truncates `capacity` to
`Integer.MAX_VALUE` (2,147,483,647).
This renders the parameter setting ineffective, as the intended proportional
cache capacity cannot be achieved. To resolve this, `capacity` should be
declared as a `long`, and the `totalSize` variable should also be converted to
a `long`, so that both can accurately represent capacities larger than
2,147,483,647.
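For illustration, here is a minimal, self-contained sketch of the truncation
(this is not the actual Hadoop source; the class and variable names and the
0.5f fraction are assumptions based on the numbers reported above):

```java
// Minimal sketch (not the Hadoop source): why the capacity is clamped at Integer.MAX_VALUE.
public class CapacityOverflowDemo {
  public static void main(String[] args) {
    float fraction = 0.5f;             // dfs.journalnode.edit-cache-size.fraction (default)
    long maxMemory = 30_542_397_440L;  // Runtime.getRuntime().maxMemory() on a ~32GB heap

    // Before the fix: casting the float product to int saturates at Integer.MAX_VALUE.
    int capacityAsInt = (int) (fraction * maxMemory);
    // After the fix: keeping the result as a long preserves the intended value.
    long capacityAsLong = (long) (fraction * maxMemory);

    System.out.println(capacityAsInt);   // 2147483647
    System.out.println(capacityAsLong);  // 15271198720
  }
}
```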
1. The error situation is as follows:
   The `dfs.journalnode.edit-cache-size.fraction` parameter uses its default
   value of 0.5f. I configured the JournalNode heap to 30,542,397,440 bytes
   and expected a capacity of 15,271,198,720 bytes, but the capacity is always
   Integer.MAX_VALUE = 2,147,483,647 bytes:
   `2025-04-15 14:14:03,970 INFO server.Journal
   (JournaledEditsCache.java:<init>(144)) - Enabling the journaled edits cache
   with a capacity of bytes: 2147483647`
2. The repaired result is as follows and meets expectations:
   `2025-04-15 16:04:44,840 INFO server.Journal
   (JournaledEditsCache.java:<init>(144)) - Enabling the journaled edits cache
   with a capacity of bytes: 15271198720`
### How was this patch tested?
Since `Runtime.getRuntime().maxMemory()` cannot be adjusted in unit tests, it
is hard to write a unit test for this change, but the existing unit tests
still pass with it.
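One possible way to make this testable (a hedged sketch only; the helper
method and test below are hypothetical and not part of this patch) would be to
factor the fraction-based computation into a method that takes the max-memory
value as a parameter instead of reading `Runtime.getRuntime().maxMemory()`
directly:

```java
import static org.junit.Assert.assertEquals;
import static org.junit.Assert.assertTrue;

import org.junit.Test;

public class TestEditCacheCapacityComputation {

  // Hypothetical helper: mirrors the fraction-based capacity computation,
  // taking the heap size as an argument so the test does not depend on
  // Runtime.getRuntime().maxMemory().
  static long computeCapacity(float fraction, long maxMemoryBytes) {
    return (long) (fraction * maxMemoryBytes);
  }

  @Test
  public void testCapacityLargerThanIntMaxIsNotTruncated() {
    long maxMemory = 30_542_397_440L;  // simulated ~32GB heap
    long capacity = computeCapacity(0.5f, maxMemory);

    assertEquals(15_271_198_720L, capacity);
    assertTrue(capacity > Integer.MAX_VALUE);  // would have been clamped before the fix
  }
}
```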
> The JournaledEditsCache has an int overflow issue, causing the maximum
> capacity to always be Integer.MAX_VALUE
> --------------------------------------------------------------------------------------------------------------
>
> Key: HDFS-17772
> URL: https://issues.apache.org/jira/browse/HDFS-17772
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: namenode
> Affects Versions: 3.4.2
> Reporter: Guo Wei
> Priority: Minor
> Fix For: 3.4.2
>
>
> When we use RBF SBN READ in our production environment, we found the
> following issue.
> HDFS-16550 introduced the parameter `dfs.journalnode.edit-cache-size.fraction`
> to size the cache as a fraction of the JournalNode heap memory, but it has an
> int overflow issue:
> When the `dfs.journalnode.edit-cache-size.fraction` parameter is used to
> control the cache capacity, the initialization of `capacity` in
> `org.apache.hadoop.hdfs.qjournal.server.JournaledEditsCache#JournaledEditsCache`,
> suffers a long-to-int overflow. For instance, when the heap memory is
> configured as 32GB (where `Runtime.getRuntime().maxMemory()` returns
> 30,542,397,440 bytes), the overflow truncates `capacity` to
> `Integer.MAX_VALUE` (2,147,483,647). This renders the parameter setting
> ineffective, as the intended proportional cache capacity cannot be achieved.
> To resolve this, `capacity` should be declared as a `long`, and the
> `totalSize` variable should also be converted to a `long`, so that both can
> accurately represent capacities larger than 2,147,483,647.
> The error situation is as follows:
> {code:java}
> The dfs.journalnode.edit-cache-size.fraction parameter uses its default value
> of 0.5f. I configured the JournalNode heap to 30,542,397,440 bytes and
> expected a capacity of 15,271,198,720 bytes, but the capacity is always
> Integer.MAX_VALUE = 2,147,483,647 bytes:
> 2025-04-15 14:14:03,970 INFO server.Journal
> (JournaledEditsCache.java:<init>(144)) - Enabling the journaled edits cache
> with a capacity of bytes: 2147483647
> The repaired result is as follows and meets expectations:
> 2025-04-15 16:04:44,840 INFO server.Journal
> (JournaledEditsCache.java:<init>(144)) - Enabling the journaled edits cache
> with a capacity of bytes: 15271198720 {code}