[ https://issues.apache.org/jira/browse/HDFS-17870?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

ConfX updated HDFS-17870:
-------------------------
    Description: 
h2. Summary

`MiniDFSCluster.restartDataNode()` throws `ArrayIndexOutOfBoundsException` when 
attempting to restart a DataNode that was added to the cluster without 
specifying storage capacities.
 
h2. Description

When a DataNode is added to a MiniDFSCluster using `startDataNodes()` with 
`storageCapacities = null`, and that DataNode is later restarted using 
`restartDataNode()`, an `ArrayIndexOutOfBoundsException` is thrown.

The root cause is that `MiniDFSCluster.setDataNodeStorageCapacities()` indexes 
the `storageCapacities` array by DataNode position without first validating 
that the index is within bounds. The capacities recorded when the cluster is 
built cover only the original DataNodes; a DataNode added later with 
`storageCapacities = null` gets no entry, so its index falls past the end of 
the array.
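Roughly, the mismatch looks like this (a simplified sketch with hypothetical 
values, not the exact Hadoop source):
{code:java}
long[][] storageCap = { { 500L * 1024 * 1024 } }; // recorded at build time: one entry
int curDnIdx = 1;                                 // position of the later-added DataNode
long[] caps = storageCap[curDnIdx];               // ArrayIndexOutOfBoundsException: 1
{code}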
 
h2. Steps to Reproduce

1. Create a MiniDFSCluster with explicit storage capacities:
{code:java}
   cluster = new MiniDFSCluster.Builder(conf)
       .numDataNodes(1)
       .storageCapacities(new long[] { CAPACITY })
       .build(); {code}
2. Add another DataNode without specifying storage capacities:
{code:java}
   cluster.startDataNodes(conf, 1, true, null, null);  // storageCapacities is null {code}
3. Restart the second DataNode:
{code:java}
   cluster.restartDataNode(1);  // Throws ArrayIndexOutOfBoundsException {code}
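Put together, the steps above make a self-contained reproducer. Below is a 
minimal JUnit 4 test sketch (class and method names are illustrative; it 
assumes the standard hadoop-hdfs test dependencies):
{code:java}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hdfs.HdfsConfiguration;
import org.apache.hadoop.hdfs.MiniDFSCluster;
import org.junit.Test;

public class TestRestartWithoutStorageCapacities {
  private static final long CAPACITY = 500L * 1024 * 1024;

  @Test
  public void testRestartDataNodeAddedWithoutCapacities() throws Exception {
    Configuration conf = new HdfsConfiguration();
    MiniDFSCluster cluster = new MiniDFSCluster.Builder(conf)
        .numDataNodes(1)
        .storageCapacities(new long[] { CAPACITY })
        .build();
    try {
      cluster.waitActive();
      // Add a second DataNode without specifying storage capacities.
      cluster.startDataNodes(conf, 1, true, null, null);
      cluster.waitActive();
      // Throws ArrayIndexOutOfBoundsException before the proposed fix.
      cluster.restartDataNode(1);
    } finally {
      cluster.shutdown();
    }
  }
}
{code}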
h2. Expected Behavior

`restartDataNode()` should successfully restart the DataNode regardless of 
whether storage capacities were specified when the DataNode was originally 
added.
h2. Actual Behavior
{code:java}
java.lang.ArrayIndexOutOfBoundsException: 1
    at org.apache.hadoop.hdfs.MiniDFSCluster.setDataNodeStorageCapacities(MiniDFSCluster.java:1882)
    at org.apache.hadoop.hdfs.MiniDFSCluster.restartDataNode(MiniDFSCluster.java:2557)
    at org.apache.hadoop.hdfs.MiniDFSCluster.restartDataNode(MiniDFSCluster.java:2596)
    at org.apache.hadoop.hdfs.MiniDFSCluster.restartDataNode(MiniDFSCluster.java:2576)
{code}
h2. Proposed Fix

Add a bounds check in `setDataNodeStorageCapacities()` to handle the case where 
the DataNode index falls outside the `storageCapacities` array.
{code:java}
   private synchronized void setDataNodeStorageCapacities(
       final int curDnIdx,
       final DataNode curDn,
       long[][] storageCapacities) throws IOException {
-    if (storageCapacities == null || storageCapacities.length == 0) {
+    // Check for null/empty array AND ensure the index is within bounds.
+    // DataNodes added without explicit storageCapacities won't have
+    // an entry in the storageCap list.
+    if (storageCapacities == null || storageCapacities.length == 0
+        || curDnIdx >= storageCapacities.length) {
       return;
     }
     // ... remainder of the method unchanged
{code}
I'm happy to send a PR for this issue.

> ArrayIndexOutOfBoundsException when DataNode was restarted without 
> storageCapacities
> ------------------------------------------------------------------------------------
>
>                 Key: HDFS-17870
>                 URL: https://issues.apache.org/jira/browse/HDFS-17870
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: test
>            Reporter: ConfX
>            Priority: Major
>