[ https://issues.apache.org/jira/browse/HDFS-17772?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17944904#comment-17944904 ]

ASF GitHub Bot commented on HDFS-17772:
---------------------------------------

gyz-web opened a new pull request, #7617:
URL: https://github.com/apache/hadoop/pull/7617

   JIRA: [HDFS-17772](https://issues.apache.org/jira/browse/HDFS-17772).
   
   ### Description of PR
   When using RBF SBN READ in our production environment, we found the 
following issue.
   
   HDFS-16550 added the parameter `dfs.journalnode.edit-cache-size.fraction` 
to size the cache as a fraction of the JournalNode heap, but the 
implementation has an int overflow issue:
   
   When the `dfs.journalnode.edit-cache-size.fraction` parameter is used to 
control the cache capacity, the initialization of `capacity` in 
`org.apache.hadoop.hdfs.qjournal.server.JournaledEditsCache#JournaledEditsCache`
 suffers **a long-to-int overflow**. For instance, when the heap is configured 
as 32GB (where `Runtime.getRuntime().maxMemory()` returns 30,542,397,440 
bytes), the computed capacity is truncated to `Integer.MAX_VALUE` 
(2,147,483,647).
   
   This renders the parameter setting ineffective, because the intended 
proportional cache capacity can never be reached. To resolve this, `capacity` 
should be declared as a `long`, and the `totalSize` variable should also be 
converted to a `long` so it cannot overflow once `capacity` exceeds 
2,147,483,647; both variables can then represent large values accurately.
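   
   To make the failure mode concrete, here is a minimal, self-contained sketch 
of the arithmetic. It is not the actual `JournaledEditsCache` code; the method 
names and the exact clamping are illustrative assumptions only. The point is 
that any capacity forced into an `int` caps at `Integer.MAX_VALUE`, while a 
`long` keeps the intended fraction of the heap.
   
   ```java
   public class EditsCacheCapacitySketch {
     // Hypothetical illustration only; the real JournaledEditsCache code differs.

     /** Buggy variant: the result is forced into an int, so it caps at Integer.MAX_VALUE. */
     static int capacityAsInt(long maxHeapBytes, float fraction) {
       return (int) Math.min((long) (maxHeapBytes * fraction), Integer.MAX_VALUE);
     }

     /** Fixed variant: keep the capacity as a long so the full fraction of the heap is usable. */
     static long capacityAsLong(long maxHeapBytes, float fraction) {
       return (long) (maxHeapBytes * fraction);
     }

     public static void main(String[] args) {
       long maxHeap = 30_542_397_440L;  // ~32 GB heap from the report
       float fraction = 0.5f;           // default dfs.journalnode.edit-cache-size.fraction
       System.out.println(capacityAsInt(maxHeap, fraction));   // 2147483647 (clamped)
       System.out.println(capacityAsLong(maxHeap, fraction));  // 15271198720 (expected)
     }
   }
   ```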
   
   1. The error situation is as follows: 
       The `dfs.journalnode.edit-cache-size.fraction` parameter uses the 
default value of 0.5f. I configured the JournalNode heap to 30,542,397,440 
bytes and expected a capacity of 15,271,198,720 bytes, but the capacity is 
always Integer.MAX_VALUE = 2,147,483,647 bytes:
   `2025-04-15 14:14:03,970 INFO  server.Journal 
(JournaledEditsCache.java:<init>(144)) - Enabling the journaled edits cache 
with a capacity of bytes: 2147483647`
   
   2. After the fix, the result matches the expectation:
   `2025-04-15 16:04:44,840 INFO  server.Journal 
(JournaledEditsCache.java:<init>(144)) - Enabling the journaled edits cache 
with a capacity of bytes: 15271198720`
   
   ### How was this patch tested?
   Since `Runtime.getRuntime().maxMemory()` cannot be adjusted in unit tests, 
it is hard to write a unit test for this change, but the change passes the 
existing unit tests.
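   
   If a dedicated test is wanted later, one option (purely a suggestion, not 
part of this patch) is to factor the capacity calculation into a small helper 
that takes the max-memory value as a parameter, so the overflow boundary can 
be exercised without depending on the real heap size. A rough JUnit 4 sketch, 
with an assumed helper name `computeCapacity`:
   
   ```java
   import static org.junit.Assert.assertEquals;
   import org.junit.Test;

   public class TestEditsCacheCapacity {

     // Assumed helper: mirrors the capacity math but accepts maxMemory as a
     // parameter so tests do not depend on Runtime.getRuntime().maxMemory().
     static long computeCapacity(long maxMemoryBytes, float fraction) {
       return (long) (maxMemoryBytes * fraction);
     }

     @Test
     public void testCapacityDoesNotOverflowInt() {
       long maxMemory = 30_542_397_440L;        // ~32 GB heap from the report
       long capacity = computeCapacity(maxMemory, 0.5f);
       assertEquals(15_271_198_720L, capacity); // well above Integer.MAX_VALUE
     }
   }
   ```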
   
   




> The JournaledEditsCache has an int overflow issue, causing the maximum 
> capacity to always be Integer.MAX_VALUE
> --------------------------------------------------------------------------------------------------------------
>
>                 Key: HDFS-17772
>                 URL: https://issues.apache.org/jira/browse/HDFS-17772
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: namenode
>    Affects Versions: 3.4.2
>            Reporter: Guo Wei
>            Priority: Minor
>             Fix For: 3.4.2
>
>
> When we use RBF SBN READ in the production environment, we found the 
> following issue. 
> HDFS-16550 provides the parameter `dfs.journalnode.edit-cache-size.fraction` 
> to control the cache size as a fraction of the JournalNode memory, but there 
> is an int overflow issue: 
> When using the `dfs.journalnode.edit-cache-size.fraction` parameter 
> to control cache capacity, during the initialization of `capacity` in 
> `org.apache.hadoop.hdfs.qjournal.server.JournaledEditsCache#JournaledEditsCache`,
>  a long-to-int overflow occurs. For instance, when the heap memory is 
> configured as 32GB (where `Runtime.getRuntime().maxMemory()` returns 
> 30,542,397,440 bytes), the overflow results in the `capacity` being truncated 
> to `Integer.MAX_VALUE` (2,147,483,647). This renders the parameter setting 
> ineffective, as the intended proportional cache capacity cannot be achieved. 
> To resolve this, `capacity` should be declared as a `long` type, and the 
> `totalSize` variable should also be converted to a `long` type to prevent 
> overflow in scenarios where `capacity` exceeds 2,147,483,647, ensuring both 
> variables can accurately represent large values without integer limitations.
> The error situation is as follows:
> The dfs.journalnode.edit-cache-size.fraction parameter uses the default value 
> of 0.5f. I configured the JournalNode heap to 30,542,397,440 bytes and 
> expected a capacity of 15,271,198,720 bytes, but the capacity is always 
> Integer.MAX_VALUE = 2,147,483,647 bytes:
> {code}
> 2025-04-15 14:14:03,970 INFO  server.Journal 
> (JournaledEditsCache.java:<init>(144)) - Enabling the journaled edits cache 
> with a capacity of bytes: 2147483647
> {code}
> After the fix, the result matches the expectation:
> {code}
> 2025-04-15 16:04:44,840 INFO  server.Journal 
> (JournaledEditsCache.java:<init>(144)) - Enabling the journaled edits cache 
> with a capacity of bytes: 15271198720
> {code}


