[
https://issues.apache.org/jira/browse/HDFS-17860?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
ConfX updated HDFS-17860:
-------------------------
Description:
## Overview
A NullPointerException occurs in `BlocksMap.numNodes()` when calling
`getBlockLocations()` on a file that was created via the `concat()` operation,
after a NameNode restart. This is a critical production bug that could crash
the NameNode during normal file read operations.
---
## Reproduction
### Quick Start with Bash Script
**Use the provided `reproduce.sh` script to automatically reproduce this bug:**
This script automates the entire reproduction process by cloning Hadoop 3.3.5,
applying the test patch, building the project, and running the failing test
case.
```bash
cd bug1
./reproduce.sh
```
**What the script does:**
1. Clones the Hadoop repository (release 3.3.5 branch)
2. Applies the test patch (`restart.patch`) that adds the reproduction test
3. Builds the Hadoop HDFS module
4. Runs the test case `TestHDFSConcat#testConcatWithRestart`, which demonstrates
the NullPointerException
The bug is confirmed if the test fails with a `NullPointerException` in
`BlocksMap.numNodes()`.
---
### Manual Reproduction Steps
If you prefer to run the test manually:
```bash
mvn surefire:test \
  -Dtest=TestHDFSConcat_RestartInjected#testConcat_AfterConcat_NN_Crash
```
### Test Scenario
1. Create a target file (`/trg`) with 3 blocks (512 bytes each)
2. Create 10 source files, each with 3 blocks
3. Call `dfs.concat(trgPath, files)` to concatenate all source files into the target
4. Restart the NameNode
5. Call `nn.getBlockLocations(trg, 0, trgLen)` on the concatenated file
6. **NPE occurs** at `BlocksMap.numNodes()`
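The scenario above condenses to the following MiniDFSCluster sketch. This is a minimal sketch, assuming the standard `MiniDFSCluster`/`DFSTestUtil` test helpers; the class name and file paths here are illustrative, and the actual test lives in `restart.patch`:
```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.hdfs.DFSConfigKeys;
import org.apache.hadoop.hdfs.DFSTestUtil;
import org.apache.hadoop.hdfs.DistributedFileSystem;
import org.apache.hadoop.hdfs.HdfsConfiguration;
import org.apache.hadoop.hdfs.MiniDFSCluster;

public class ConcatRestartRepro {
  public static void main(String[] args) throws Exception {
    Configuration conf = new HdfsConfiguration();
    // 512-byte blocks, so a 1536-byte file has exactly 3 full blocks.
    conf.setLong(DFSConfigKeys.DFS_BLOCK_SIZE_KEY, 512);
    MiniDFSCluster cluster =
        new MiniDFSCluster.Builder(conf).numDataNodes(1).build();
    try {
      DistributedFileSystem dfs = cluster.getFileSystem();

      Path trg = new Path("/trg");
      DFSTestUtil.createFile(dfs, trg, 3 * 512, (short) 1, 0L);  // 3 blocks

      Path[] srcs = new Path[10];
      for (int i = 0; i < srcs.length; i++) {
        srcs[i] = new Path("/src" + i);
        DFSTestUtil.createFile(dfs, srcs[i], 3 * 512, (short) 1, 0L);
      }
      dfs.concat(trg, srcs);      // target now references 3 + 30 = 33 blocks

      cluster.restartNameNode();  // reload FSImage/EditLog
      long trgLen = dfs.getFileStatus(trg).getLen();
      // The NPE is thrown from BlocksMap.numNodes() inside this call:
      cluster.getNameNodeRpc().getBlockLocations("/trg", 0, trgLen);
    } finally {
      cluster.shutdown();
    }
  }
}
```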
---
## Stack Trace
```
java.lang.NullPointerException
    at org.apache.hadoop.hdfs.server.blockmanagement.BlocksMap.numNodes(BlocksMap.java:172)
    at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.createLocatedBlock(BlockManager.java:1420)
    at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.createLocatedBlock(BlockManager.java:1382)
    at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.createLocatedBlockList(BlockManager.java:1353)
    at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.createLocatedBlocks(BlockManager.java:1503)
    at org.apache.hadoop.hdfs.server.namenode.FSDirStatAndListingOp.getBlockLocations(FSDirStatAndListingOp.java:179)
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocations(FSNamesystem.java:2124)
    at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.getBlockLocations(NameNodeRpcServer.java:769)
```
---
## Root Cause Analysis
### Code Location
**File:** `BlocksMap.java:172`
```java
int numNodes(Block b) {
  BlockInfo info = blocks.get(b);  // LINE 172 - NPE HERE
  return info == null ? 0 : info.numNodes();
}
```
### Why the NPE Occurs
The NPE happens because a **null** `Block` parameter `b` is passed to
`numNodes()`. When `b` is null, `blocks.get(null)` throws a
`NullPointerException` because:
1. The `blocks` map is likely a `LightWeightGSet` or similar hash-based
implementation
2. Its `get()` method computes `b.hashCode()` to locate the bucket
3. Dereferencing the null key for `hashCode()` throws the NPE
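A standalone toy illustration of this failure mode (not Hadoop code; a minimal sketch of a `LightWeightGSet`-style hash-indexed lookup, as assumed above):
```java
/** Toy hash-indexed set illustrating why a null key throws an NPE. */
class ToySet<E> {
  private final Object[] buckets = new Object[16];

  @SuppressWarnings("unchecked")
  E get(E key) {
    // The key is dereferenced to compute the bucket index,
    // so a null key fails right here.
    int index = key.hashCode() & (buckets.length - 1);  // NPE if key == null
    return (E) buckets[index];
  }

  public static void main(String[] args) {
    new ToySet<String>().get(null);  // throws java.lang.NullPointerException
  }
}
```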
### How Null Blocks Enter the System
#### Debug Investigation
I added debug logging to `BlockManager.createLocatedBlockList()` to inspect the
blocks array:
```java
LOG.info("RESTART_DEBUG: createLocatedBlockList called with blocks.length=" +
blocks.length);
for (int i = 0; i < blocks.length; i++) {
if (blocks[i] == null) {
LOG.error("RESTART_DEBUG: blocks[" + i + "] is NULL!");
}
}
```
**Finding:** The blocks array itself does **NOT** contain null elements after
restart. All 33 blocks in the concatenated file's blocks array are non-null
BlockInfo objects.
#### Hypothesis: Stale Block References
After concat, the file's INode contains the blocks of the original target file
plus all blocks from the concatenated source files. The test logs show:
- Before restart: each source file has 3 blocks (10 files × 3 = 30 blocks)
- After concat: the target file should have 3 + 30 = 33 blocks
- After restart: `blocks.length=33`, i.e. all blocks are present

However, based on the stack trace and NPE location, the issue likely stems from
the following sequence (a probing check is sketched after this list):
1. The **concat operation** moves blocks from the source files to the target file
2. The source files are deleted after concat
3. Block metadata for the source files may be marked for deletion/invalidation
4. **After restart**, when the NameNode reloads the FSImage/EditLog:
   - The concatenated file's INode correctly references all 33 blocks
   - BUT some blocks may have been removed from the `BlocksMap` during fsimage load
   - Or blocks from the deleted source files were never added to the BlocksMap
5. When `createLocatedBlock()` is called with one of these "ghost" block references:
   - The BlockInfo object exists in the file's blocks array
   - But the Block/BlockInfo lookup in the BlocksMap fails or returns inconsistent state
   - A null block reference propagates to `numNodes()`
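One way to probe this hypothesis is to check, after restart, whether every block in the file's array still resolves through the global map. A hedged sketch, placed next to the existing debug logging and assuming it is inserted inside `BlockManager` (where `getStoredBlock()` is directly available); the log text is illustrative:
```java
// Probe sketch: does each INode-referenced block still resolve in BlocksMap?
for (int i = 0; i < blocks.length; i++) {
  BlockInfo stored = getStoredBlock(blocks[i]);
  if (stored == null) {
    LOG.error("RESTART_DEBUG: blocks[" + i + "] = " + blocks[i]
        + " is in the INode's block list but missing from BlocksMap");
  }
}
```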
---
## Evidence
### Test Log Analysis
**Before Restart (all blocks non-null):**
```
2025-12-01 12:44:06,136 [Time-limited test] INFO BlockManager - RESTART_DEBUG: createLocatedBlockList called with blocks.length=3, offset=0, length=1536
blocks[0] = blk_1073741825_1001, numBytes=512
blocks[1] = blk_1073741826_1002, numBytes=512
blocks[2] = blk_1073741827_1003, numBytes=512
```
**After Restart (file now has 33 blocks from concat, all non-null):**
```
2025-12-01 12:44:15,034 [IPC Server handler 0] INFO BlockManager - RESTART_DEBUG: createLocatedBlockList called with blocks.length=33, offset=0, length=5120
blocks[0] = blk_1073741825_1001, numBytes=512
blocks[1] = blk_1073741826_1002, numBytes=512
blocks[2] = blk_1073741827_1003, numBytes=512
blocks[3] = blk_1073741855_1031, numBytes=512
... (all 33 blocks non-null)
blocks[32] = blk_1073741830_1006, numBytes=512
```
**Then the NPE occurs**, suggesting the null block comes from a code path not
instrumented by my debug logging, or from a race condition during BlocksMap
access.
---
## Likely Bug Location
### Suspect Area 1: Concat Implementation
The concat operation (in `FSNamesystem` or `FSDirectory`) may not properly
handle block ownership transfer during the transaction that is persisted to the
edit log. On restart:
- FSImage loading might not correctly restore all blocks to the BlocksMap
- Blocks from deleted source files might be in a transitional state
### Suspect Area 2: BlockInfo Reference vs BlocksMap Inconsistency
There may be a race or ordering issue where:
1. INodeFile's blocks array references BlockInfo objects
2. These BlockInfo objects are not yet added to (or have been removed from) the
global BlocksMap
3. When `createLocatedBlock()` tries to look up block locations, it accesses a
BlockInfo that's not in the map
### Code Path to NPE
```
getBlockLocations()
  → createLocatedBlocks()
    → createLocatedBlockList()
      → createLocatedBlock(blocks[curBlk], ...)  // blocks[curBlk] might be problematic
        → createLocatedBlock(blk, ...)
          → blocksMap.numNodes(blk)  // NPE if blk is somehow null or invalid
```
---
## Impact
1. **NameNode Crash Risk:** NPE in RPC handler can crash the NameNode during
client `getBlockLocations()` calls
2. **Data Availability:** Files created via `concat()` become unreadable after
NameNode restart
3. **Silent Corruption:** The concat operation appears to succeed, but the file
is broken after restart
### Affected Operations
- Any `getBlockLocations()` call on concatenated files after restart
- File reads (since clients call getBlockLocations)
- MapReduce/Spark jobs reading concatenated files
- Backup/replication tools accessing these files
---
## Recommended Fix
### Immediate Mitigation
Add a null-safety check in `BlocksMap.numNodes()`:
```java
int numNodes(Block b) {
  if (b == null) {
    LOG.error("Null block passed to numNodes()!", new Exception("Stack trace"));
    return 0;  // or throw IOException
  }
  BlockInfo info = blocks.get(b);
  return info == null ? 0 : info.numNodes();
}
```
And in `BlockManager.createLocatedBlock()`:
```java
private LocatedBlock createLocatedBlock(LocatedBlockBuilder locatedBlocks,
    final BlockInfo blk, final long pos, final AccessMode mode)
    throws IOException {
  if (blk == null) {
    LOG.error("Null block in createLocatedBlock at pos=" + pos, new Exception());
    throw new IOException("Null block reference in file's block list");
  }
  // ... rest of method
}
```
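The two guards behave differently by design: the `numNodes()` check degrades gracefully (zero replicas reported), while the `createLocatedBlock()` check fails fast with an `IOException` that reaches the client. Either way, the RPC handler no longer dies on an opaque NPE.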
### Root Cause Fix
**Requires deeper investigation:**
1. **Audit the concat implementation** to ensure all blocks are properly:
   - Added to the target file's INode
   - Registered in the BlocksMap
   - Persisted correctly in the edit log
   - Loaded correctly from the FSImage/EditLog on restart
2. **Check FSImage/EditLog loading** for concat transactions:
   - Verify that blocks from concatenated files are added to the BlocksMap
   - Ensure proper ordering of operations during replay
   - Check for race conditions in block map population
3. **Add consistency checks** during NameNode startup (see the sketch below):
   - Verify that all blocks referenced by INodes exist in the BlocksMap
   - Log warnings for orphaned block references
   - Offer an auto-repair or fail-safe mode
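As a starting point for item 3, a minimal sketch of such a check, assuming it runs after FSImage/EditLog loading completes. `allINodeFiles()` is a hypothetical traversal of the INode tree (a real implementation would walk `FSDirectory`), and the log message is illustrative:
```java
// Sketch: after loading FSImage/EditLog, verify that every block
// referenced by a file INode resolves in the global BlocksMap.
void verifyBlockMapConsistency(BlockManager blockManager) {
  for (INodeFile file : allINodeFiles()) {  // hypothetical INode-tree traversal
    for (BlockInfo b : file.getBlocks()) {
      if (blockManager.getStoredBlock(b) == null) {
        LOG.warn("Orphaned block reference: " + file.getFullPathName()
            + " references " + b + ", which is absent from BlocksMap");
      }
    }
  }
}
```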
I'm happy to discuss any potential root cause fix.
> NPE in BlocksMap After NameNode Concat and Restart
> --------------------------------------------------
>
> Key: HDFS-17860
> URL: https://issues.apache.org/jira/browse/HDFS-17860
> Project: Hadoop HDFS
> Issue Type: Bug
> Affects Versions: 3.3.5
> Reporter: ConfX
> Priority: Major
> Attachments: reproduce.sh, restart.patch
>
--
This message was sent by Atlassian Jira
(v8.20.10#820010)
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]