[ https://issues.apache.org/jira/browse/HDFS-17916?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18082625#comment-18082625 ]

ASF GitHub Bot commented on HDFS-17916:
---------------------------------------

hadoop-yetus commented on PR #8466:
URL: https://github.com/apache/hadoop/pull/8466#issuecomment-4509988895

   :broken_heart: **-1 overall**

   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |:----:|----------:|--------:|:--------:|:-------:|
   | +0 :ok: |  reexec  |   0m 54s |  |  Docker mode activated.  |
   |||| _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  0s |  |  No case conflicting files found.  |
   | +0 :ok: |  codespell  |   0m  1s |  |  codespell was not available.  |
   | +0 :ok: |  detsecrets  |   0m  1s |  |  detect-secrets was not available.  |
   | +1 :green_heart: |  @author  |   0m  0s |  |  The patch does not contain any @author tags.  |
   | +1 :green_heart: |  test4tests  |   0m  0s |  |  The patch appears to include 1 new or modified test files.  |
   |||| _ trunk Compile Tests _ |
   | +0 :ok: |  mvndep  |   1m 57s |  |  Maven dependency ordering for branch  |
   | +1 :green_heart: |  mvninstall  |  52m 41s |  |  trunk passed  |
   | +1 :green_heart: |  compile  |   5m 57s |  |  trunk passed with JDK Ubuntu-21.0.10+7-Ubuntu-124.04  |
   | +1 :green_heart: |  compile  |   6m 29s |  |  trunk passed with JDK Ubuntu-17.0.18+8-Ubuntu-124.04.1  |
   | +1 :green_heart: |  checkstyle  |   2m 14s |  |  trunk passed  |
   | +1 :green_heart: |  mvnsite  |   3m 16s |  |  trunk passed  |
   | +1 :green_heart: |  javadoc  |   2m 33s |  |  trunk passed with JDK Ubuntu-21.0.10+7-Ubuntu-124.04  |
   | +1 :green_heart: |  javadoc  |   2m 35s |  |  trunk passed with JDK Ubuntu-17.0.18+8-Ubuntu-124.04.1  |
   | +1 :green_heart: |  spotbugs  |   8m 14s |  |  trunk passed  |
   | +1 :green_heart: |  shadedclient  |  37m  3s |  |  branch has no errors when building and testing our client artifacts.  |
   |||| _ Patch Compile Tests _ |
   | +0 :ok: |  mvndep  |   0m 30s |  |  Maven dependency ordering for patch  |
   | +1 :green_heart: |  mvninstall  |   2m 15s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   5m 25s |  |  the patch passed with JDK Ubuntu-21.0.10+7-Ubuntu-124.04  |
   | +1 :green_heart: |  javac  |   5m 25s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   6m  2s |  |  the patch passed with JDK Ubuntu-17.0.18+8-Ubuntu-124.04.1  |
   | +1 :green_heart: |  javac  |   6m  2s |  |  the patch passed  |
   | +1 :green_heart: |  blanks  |   0m  0s |  |  The patch has no blanks issues.  |
   | +1 :green_heart: |  checkstyle  |   1m 45s |  |  the patch passed  |
   | +1 :green_heart: |  mvnsite  |   2m 27s |  |  the patch passed  |
   | +1 :green_heart: |  javadoc  |   1m 36s |  |  the patch passed with JDK Ubuntu-21.0.10+7-Ubuntu-124.04  |
   | +1 :green_heart: |  javadoc  |   1m 41s |  |  the patch passed with JDK Ubuntu-17.0.18+8-Ubuntu-124.04.1  |
   | +1 :green_heart: |  spotbugs  |   7m 40s |  |  the patch passed  |
   | +1 :green_heart: |  shadedclient  |  36m 56s |  |  patch has no errors when building and testing our client artifacts.  |
   |||| _ Other Tests _ |
   | +1 :green_heart: |  unit  |   2m 37s |  |  hadoop-hdfs-client in the patch passed.  |
   | -1 :x: |  unit  | 255m  0s | [/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-8466/3/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt) |  hadoop-hdfs in the patch passed.  |
   | +1 :green_heart: |  asflicense  |   0m 51s |  |  The patch does not generate ASF License warnings.  |
   |  |   | 447m 26s |  |  |
   
   
   | Reason | Tests |
   |-------:|:------|
   | Failed junit tests | hadoop.hdfs.server.datanode.fsdataset.impl.TestFsVolumeList |
   
   
   | Subsystem | Report/Notes |
   |----------:|:-------------|
   | Docker | ClientAPI=1.54 ServerAPI=1.54 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-8466/3/artifact/out/Dockerfile |
   | GITHUB PR | https://github.com/apache/hadoop/pull/8466 |
   | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets |
   | uname | Linux 3447ae1728f0 5.15.0-174-generic #184-Ubuntu SMP Fri Mar 13 18:41:50 UTC 2026 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | dev-support/bin/hadoop.sh |
   | git revision | trunk / 853a790cb64dae36de186fda2ca3cb6f00f7e9d4 |
   | Default Java | Ubuntu-17.0.18+8-Ubuntu-124.04.1 |
   | Multi-JDK versions | /usr/lib/jvm/java-21-openjdk-amd64:Ubuntu-21.0.10+7-Ubuntu-124.04 /usr/lib/jvm/java-17-openjdk-amd64:Ubuntu-17.0.18+8-Ubuntu-124.04.1 |
   | Test Results | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-8466/3/testReport/ |
   | Max. process+thread count | 2382 (vs. ulimit of 10000) |
   | modules | C: hadoop-hdfs-project/hadoop-hdfs-client hadoop-hdfs-project/hadoop-hdfs U: hadoop-hdfs-project |
   | Console output | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-8466/3/console |
   | versions | git=2.43.0 maven=3.9.15 spotbugs=4.9.7 |
   | Powered by | Apache Yetus 0.14.1 https://yetus.apache.org |
   
   
   This message was automatically generated.
   
   




> DataStreamer#processDatanodeOrExternalError() fails to return byte arrays to ByteArrayManager
> ---------------------------------------------------------------------------------------------
>
>                 Key: HDFS-17916
>                 URL: https://issues.apache.org/jira/browse/HDFS-17916
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: hdfs-client
>    Affects Versions: 3.3.6, 3.5.0, 3.4.3
>            Reporter: Charles Connell
>            Priority: Major
>              Labels: pull-request-available
>
> A [certain code path|https://github.com/apache/hadoop/blob/b322c3ce2c10b45cec2f9acbe6f00fb75c054caa/hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/DataStreamer.java#L1422]
> in the DFS client DataStreamer discards DFSPacket objects without returning
> their contained byte arrays to the ByteArrayManager. I discovered this bug at
> my company after we had HBase server threads hung for hours at
> {{ByteArrayManager#allocate()}}. Because the leak only happens in an
> error-handling path, the problem requires an unhealthy HDFS cluster in order
> to be exposed.
> I took a heap dump of a high-uptime but relatively healthy HBase server, and
> found evidence of leaked byte arrays there too. In the heap dump, the two
> FixedLengthManagers both had {{numAllocated = 9}}, but there were zero
> live {{DFSPacket}} objects. This suggests that the byte arrays and their
> containing {{DFSPacket}} objects had been garbage collected, unbeknownst to
> {{FixedLengthManager}}.
> In DataStreamer.java starting at line 1410, the {{DFSPacket}} that is
> removed from {{dataQueue}} with {{remove()}} is allowed to be garbage
> collected without its byte array ever being returned to the manager.
> {code:java}
>     if (!streamerClosed && dfsClient.clientRunning) {
>       if (stage == BlockConstructionStage.PIPELINE_CLOSE) {
>         // If we had an error while closing the pipeline, we go through a fast-path
>         // where the BlockReceiver does not run. Instead, the DataNode just finalizes
>         // the block immediately during the 'connect ack' process. So, we want to pull
>         // the end-of-block packet from the dataQueue, since we don't actually have
>         // a true pipeline to send it over.
>         //
>         // We also need to set lastAckedSeqno to the end-of-block Packet's seqno, so that
>         // a client waiting on close() will be aware that the flush finished.
>         synchronized (dataQueue) {
>           DFSPacket endOfBlockPacket = dataQueue.remove();  // remove the end of block packet
>           // Close any trace span associated with this Packet
>           Span span = endOfBlockPacket.getSpan();
>           if (span != null) {
>             span.finish();
>             endOfBlockPacket.setSpan(null);
>           }
>           assert endOfBlockPacket.isLastPacketInBlock();
>           assert lastAckedSeqno == endOfBlockPacket.getSeqno() - 1;
>           lastAckedSeqno = endOfBlockPacket.getSeqno();
>           pipelineRecoveryCount = 0;
>           dataQueue.notifyAll();
>         }
>         endBlock();
>       } else {
>         initDataStreaming();
>       }
>     }
> {code}
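> This could be fixed by inserting the following line into the code above,
> inside the {{synchronized (dataQueue)}} block where {{endOfBlockPacket}} is
> in scope: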
> {code:java}
> endOfBlockPacket.releaseBuffer(byteArrayManager);
> {code}
> Claude Opus 4.7 was used to assist in finding this bug. I verified the 
> findings and I stand by them.
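
The leak described above can be illustrated with a small standalone simulation. This is only a sketch under stated assumptions: `LeakSketch`, `Pool`, `Packet`, `fill`, and `drain` are hypothetical stand-ins invented for illustration, not the real `ByteArrayManager`, `FixedLengthManager`, or `DFSPacket` classes, and the allocation here times out instead of blocking indefinitely so the example terminates. It shows that a packet removed from a queue without releasing its buffer leaves the pool's allocation count at its cap, while releasing on removal brings the count back to zero.

```java
import java.util.ArrayDeque;
import java.util.Queue;

public class LeakSketch {
  /** Hypothetical stand-in for a capped array manager; not the Hadoop class. */
  static final class Pool {
    private final int limit;
    private int numAllocated;

    Pool(int limit) { this.limit = limit; }

    /** Returns a new array, or null if the cap is still reached after timeoutMs. */
    synchronized byte[] allocate(int size, long timeoutMs) {
      long deadline = System.currentTimeMillis() + timeoutMs;
      try {
        while (numAllocated >= limit) {
          long remaining = deadline - System.currentTimeMillis();
          if (remaining <= 0) {
            return null;  // a real caller with no timeout would hang here
          }
          wait(remaining);
        }
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
        return null;
      }
      numAllocated++;
      return new byte[size];
    }

    /** Gives one slot back to the pool and wakes any waiting allocators. */
    synchronized void release(byte[] array) {
      numAllocated--;
      notifyAll();
    }

    synchronized int numAllocated() { return numAllocated; }
  }

  /** Hypothetical stand-in for a packet that owns one pooled buffer. */
  static final class Packet {
    private byte[] buf;

    Packet(byte[] buf) { this.buf = buf; }

    void releaseBuffer(Pool pool) {
      if (buf != null) {
        pool.release(buf);
        buf = null;
      }
    }
  }

  /** Builds a queue of n packets, each holding a buffer from the pool. */
  static Queue<Packet> fill(Pool pool, int n) {
    Queue<Packet> queue = new ArrayDeque<>();
    for (int i = 0; i < n; i++) {
      queue.add(new Packet(pool.allocate(16, 10)));
    }
    return queue;
  }

  /** Removes every packet; buffers are returned only if releaseOnRemove is set. */
  static int drain(Queue<Packet> queue, Pool pool, boolean releaseOnRemove) {
    while (!queue.isEmpty()) {
      Packet p = queue.remove();
      if (releaseOnRemove) {
        p.releaseBuffer(pool);
      }
    }
    return pool.numAllocated();
  }

  public static void main(String[] args) {
    Pool leaky = new Pool(2);
    drain(fill(leaky, 2), leaky, false);
    System.out.println("leaky: numAllocated=" + leaky.numAllocated()
        + ", next allocate=" + (leaky.allocate(16, 50) == null ? "timed out" : "ok"));

    Pool fixed = new Pool(2);
    drain(fill(fixed, 2), fixed, true);
    System.out.println("fixed: numAllocated=" + fixed.numAllocated()
        + ", next allocate=" + (fixed.allocate(16, 50) == null ? "timed out" : "ok"));
  }
}
```

In the leaky run the count stays at the cap even though no packet is reachable any more, which is consistent with the heap-dump observation in the report (a nonzero `numAllocated` with zero live packets). Whether the real manager behaves the same way in every detail is not something this sketch establishes.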



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
