Re: [PR] HDFS-17497. The number of bytes of the last committed block should be calculated into the file length [hadoop]

2024-08-21 Thread via GitHub


hfutatzhanghb commented on code in PR #6765:
URL: https://github.com/apache/hadoop/pull/6765#discussion_r1726157523


##
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSDirectory.java:
##
@@ -1105,15 +1106,12 @@ static void unprotectedUpdateCount(INodesInPath inodesInPath,
   /**
* Update the cached quota space for a block that is being completed.
* Must only be called once, as the block is being completed.
-   * @param completeBlk - Completed block for which to update space
-   * @param inodes - INodes in path to file containing completeBlk; if null
-   * this will be resolved internally
+   * @param commitBlock - Committed block for which to update space
+   * @param iip - INodes in path to file containing committedBlock
*/
-  public void updateSpaceForCompleteBlock(BlockInfo completeBlk,
-  INodesInPath inodes) throws IOException {
+  public void updateSpaceForCommittedBlock(Block commitBlock,
+  INodesInPath iip) throws IOException {
 assert namesystem.hasWriteLock();
-INodesInPath iip = inodes != null ? inodes :
-INodesInPath.fromINode(namesystem.getBlockCollection(completeBlk));
 INodeFile fileINode = iip.getLastINode().asFile();

Review Comment:
   The original logic handled a null `inodes` argument by resolving the path internally, but the new code drops that null check.
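
The fallback this comment refers to can be sketched as follows. This is a minimal illustration with hypothetical stand-in types (`FilePath`, a `Function` resolver), not the real `INodesInPath` or `FSNamesystem` classes; whether the new callers can ever pass null is not settled in this thread.

```java
import java.util.Objects;
import java.util.function.Function;

/** Hypothetical stand-ins; not the real Hadoop classes. */
public class NullSafeIip {
  /** Stand-in for INodesInPath. */
  static final class FilePath {
    final String name;
    FilePath(String name) { this.name = name; }
  }

  /**
   * Returns the caller-supplied path, or resolves one from the block id
   * when the caller passed null (mirroring the removed fallback).
   */
  static FilePath resolve(FilePath iip, long blockId,
      Function<Long, FilePath> resolver) {
    if (iip != null) {
      return iip;
    }
    return Objects.requireNonNull(resolver.apply(blockId),
        "no file found for block " + blockId);
  }

  public static void main(String[] args) {
    // The resolver is an assumption standing in for a block-to-file lookup.
    Function<Long, FilePath> resolver = id -> new FilePath("/resolved/blk_" + id);
    System.out.println(resolve(new FilePath("/given"), 1L, resolver).name);
    System.out.println(resolve(null, 2L, resolver).name);
  }
}
```

If every caller is guaranteed to pass a non-null path, the fallback is unnecessary and an explicit precondition check would document that guarantee instead.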



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



Re: [PR] HDFS-17497. The number of bytes of the last committed block should be calculated into the file length [hadoop]

2024-05-13 Thread via GitHub


haiyang1987 commented on code in PR #6765:
URL: https://github.com/apache/hadoop/pull/6765#discussion_r1599287673


##
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockInfo.java:
##
@@ -387,6 +387,19 @@ public boolean isUnderRecovery() {
 return getBlockUCState().equals(BlockUCState.UNDER_RECOVERY);
   }
 
+  /**
+   * Is this block still under construction or recoery.

Review Comment:
   A small comment: please change `recoery` to `recovery`.






Re: [PR] HDFS-17497. The number of bytes of the last committed block should be calculated into the file length [hadoop]

2024-05-11 Thread via GitHub


hfutatzhanghb commented on code in PR #6765:
URL: https://github.com/apache/hadoop/pull/6765#discussion_r1597392676


##
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSDirectory.java:
##
@@ -1105,15 +1106,12 @@ static void unprotectedUpdateCount(INodesInPath inodesInPath,
   /**
* Update the cached quota space for a block that is being completed.
* Must only be called once, as the block is being completed.
-   * @param completeBlk - Completed block for which to update space
-   * @param inodes - INodes in path to file containing completeBlk; if null
-   * this will be resolved internally
+   * @param commitBlock - Committed block for which to update space
+   * @param iip - INodes in path to file containing committedBlock
*/
-  public void updateSpaceForCompleteBlock(BlockInfo completeBlk,
-  INodesInPath inodes) throws IOException {
+  public void updateSpaceForCommittedBlock(Block commitBlock,

Review Comment:
   Which parameter name would be better: `commitBlock` or `committedBlock`?



##
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSNamesystem.java:
##
@@ -3887,7 +3887,11 @@ void commitOrCompleteLastBlock(
   final Block commitBlock) throws IOException {
 assert hasWriteLock();
 Preconditions.checkArgument(fileINode.isUnderConstruction());
-blockManager.commitOrCompleteLastBlock(fileINode, commitBlock, iip);
+if (!blockManager.commitOrCompleteLastBlock(fileINode, commitBlock)) {
+  return;
+}
+// Updating QuotaUsage when committing block since block size will not be changed
+getFSDirectory().updateSpaceForCommittedBlock(commitBlock, iip);

Review Comment:
   Sir, how about:
   ```java
   if (blockManager.commitOrCompleteLastBlock(fileINode, commitBlock)) {
     // Updating QuotaUsage when committing block since block size will not be changed
     getFSDirectory().updateSpaceForCommittedBlock(commitBlock, iip);
   }
   ```






Re: [PR] HDFS-17497. The number of bytes of the last committed block should be calculated into the file length [hadoop]

2024-04-28 Thread via GitHub


ZanderXu commented on PR #6765:
URL: https://github.com/apache/hadoop/pull/6765#issuecomment-2081766984

   > IIRC, client also check the file length through request DataNode which manage the uncomplete block?
   
   As with other committed blocks, the client does not need to get the visible length from the DN when the last block is in the committed state.





Re: [PR] HDFS-17497. The number of bytes of the last committed block should be calculated into the file length [hadoop]

2024-04-28 Thread via GitHub


hadoop-yetus commented on PR #6765:
URL: https://github.com/apache/hadoop/pull/6765#issuecomment-2081636005

   :confetti_ball: **+1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |:----:|----------:|--------:|:--------:|:-------:|
   |||| _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m 01s |  |  No case conflicting files found.  |
   | +0 :ok: |  spotbugs  |   0m 01s |  |  spotbugs executables are not available.  |
   | +0 :ok: |  codespell  |   0m 01s |  |  codespell was not available.  |
   | +0 :ok: |  detsecrets  |   0m 01s |  |  detect-secrets was not available.  |
   | +1 :green_heart: |  @author  |   0m 00s |  |  The patch does not contain any @author tags.  |
   | +1 :green_heart: |  test4tests  |   0m 00s |  |  The patch appears to include 1 new or modified test files.  |
   |||| _ trunk Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |  86m 48s |  |  trunk passed  |
   | +1 :green_heart: |  compile  |   5m 57s |  |  trunk passed  |
   | +1 :green_heart: |  checkstyle  |   4m 39s |  |  trunk passed  |
   | +1 :green_heart: |  mvnsite  |   6m 25s |  |  trunk passed  |
   | +1 :green_heart: |  javadoc  |   5m 50s |  |  trunk passed  |
   | +1 :green_heart: |  shadedclient  | 143m 08s |  |  branch has no errors when building and testing our client artifacts.  |
   |||| _ Patch Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |   4m 37s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   3m 27s |  |  the patch passed  |
   | +1 :green_heart: |  javac  |   3m 27s |  |  the patch passed  |
   | +1 :green_heart: |  blanks  |   0m 01s |  |  The patch has no blanks issues.  |
   | +1 :green_heart: |  checkstyle  |   2m 19s |  |  the patch passed  |
   | +1 :green_heart: |  mvnsite  |   4m 00s |  |  the patch passed  |
   | +1 :green_heart: |  javadoc  |   3m 25s |  |  the patch passed  |
   | +1 :green_heart: |  shadedclient  | 154m 45s |  |  patch has no errors when building and testing our client artifacts.  |
   |||| _ Other Tests _ |
   | +1 :green_heart: |  asflicense  |   5m 13s |  |  The patch does not generate ASF License warnings.  |
   |  |   | 412m 04s |  |  |
   
   
   | Subsystem | Report/Notes |
   |----------:|:-------------|
   | GITHUB PR | https://github.com/apache/hadoop/pull/6765 |
   | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets |
   | uname | MINGW64_NT-10.0-17763 15fffafc3582 3.4.10-87d57229.x86_64 2024-02-14 20:17 UTC x86_64 Msys |
   | Build tool | maven |
   | Personality | /c/hadoop/dev-support/bin/hadoop.sh |
   | git revision | trunk / 0aa96155ae7aed9c69d8c0ede601fffd4bc8c17f |
   | Default Java | Azul Systems, Inc.-1.8.0_332-b09 |
   | Test Results | https://ci-hadoop.apache.org/job/hadoop-multibranch-windows-10/job/PR-6765/3/testReport/ |
   | modules | C: hadoop-hdfs-project/hadoop-hdfs U: hadoop-hdfs-project/hadoop-hdfs |
   | Console output | https://ci-hadoop.apache.org/job/hadoop-multibranch-windows-10/job/PR-6765/3/console |
   | versions | git=2.44.0.windows.1 |
   | Powered by | Apache Yetus 0.14.0 https://yetus.apache.org |
   
   
   This message was automatically generated.
   
   





Re: [PR] HDFS-17497. The number of bytes of the last committed block should be calculated into the file length [hadoop]

2024-04-28 Thread via GitHub


Hexiaoqiao commented on PR #6765:
URL: https://github.com/apache/hadoop/pull/6765#issuecomment-2081489723

   Great catch. I have not reviewed this carefully yet, but I remember this has been discussed for a long time. IIRC, the client also checks the file length by querying the DataNode that manages the incomplete block? Thanks.
   (I will try to review the PR later.)





Re: [PR] HDFS-17497. The number of bytes of the last committed block should be calculated into the file length [hadoop]

2024-04-24 Thread via GitHub


hadoop-yetus commented on PR #6765:
URL: https://github.com/apache/hadoop/pull/6765#issuecomment-2074475186

   :confetti_ball: **+1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |:----:|----------:|--------:|:--------:|:-------:|
   | +0 :ok: |  reexec  |   0m 45s |  |  Docker mode activated.  |
   |||| _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  0s |  |  No case conflicting files found.  |
   | +0 :ok: |  codespell  |   0m  0s |  |  codespell was not available.  |
   | +0 :ok: |  detsecrets  |   0m  0s |  |  detect-secrets was not available.  |
   | +1 :green_heart: |  @author  |   0m  0s |  |  The patch does not contain any @author tags.  |
   | +1 :green_heart: |  test4tests  |   0m  0s |  |  The patch appears to include 1 new or modified test files.  |
   |||| _ trunk Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |  50m 31s |  |  trunk passed  |
   | +1 :green_heart: |  compile  |   1m 24s |  |  trunk passed with JDK Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1  |
   | +1 :green_heart: |  compile  |   1m 14s |  |  trunk passed with JDK Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06  |
   | +1 :green_heart: |  checkstyle  |   1m 13s |  |  trunk passed  |
   | +1 :green_heart: |  mvnsite  |   1m 23s |  |  trunk passed  |
   | +1 :green_heart: |  javadoc  |   1m  8s |  |  trunk passed with JDK Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1  |
   | +1 :green_heart: |  javadoc  |   1m 40s |  |  trunk passed with JDK Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06  |
   | +1 :green_heart: |  spotbugs  |   3m 19s |  |  trunk passed  |
   | +1 :green_heart: |  shadedclient  |  41m 29s |  |  branch has no errors when building and testing our client artifacts.  |
   |||| _ Patch Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |   1m 11s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   1m 13s |  |  the patch passed with JDK Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1  |
   | +1 :green_heart: |  javac  |   1m 13s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   1m  9s |  |  the patch passed with JDK Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06  |
   | +1 :green_heart: |  javac  |   1m  9s |  |  the patch passed  |
   | +1 :green_heart: |  blanks  |   0m  0s |  |  The patch has no blanks issues.  |
   | +1 :green_heart: |  checkstyle  |   1m  3s |  |  hadoop-hdfs-project/hadoop-hdfs: The patch generated 0 new + 290 unchanged - 1 fixed = 290 total (was 291)  |
   | +1 :green_heart: |  mvnsite  |   1m 11s |  |  the patch passed  |
   | +1 :green_heart: |  javadoc  |   0m 56s |  |  the patch passed with JDK Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1  |
   | +1 :green_heart: |  javadoc  |   1m 38s |  |  the patch passed with JDK Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06  |
   | +1 :green_heart: |  spotbugs  |   3m 17s |  |  the patch passed  |
   | +1 :green_heart: |  shadedclient  |  40m 48s |  |  patch has no errors when building and testing our client artifacts.  |
   |||| _ Other Tests _ |
   | +1 :green_heart: |  unit  | 269m  3s |  |  hadoop-hdfs in the patch passed.  |
   | +1 :green_heart: |  asflicense  |   0m 52s |  |  The patch does not generate ASF License warnings.  |
   |  |   | 427m 31s |  |  |
   
   
   | Subsystem | Report/Notes |
   |----------:|:-------------|
   | Docker | ClientAPI=1.44 ServerAPI=1.44 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6765/2/artifact/out/Dockerfile |
   | GITHUB PR | https://github.com/apache/hadoop/pull/6765 |
   | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets |
   | uname | Linux 3ed817d3780c 5.15.0-94-generic #104-Ubuntu SMP Tue Jan 9 15:25:40 UTC 2024 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | dev-support/bin/hadoop.sh |
   | git revision | trunk / 0aa96155ae7aed9c69d8c0ede601fffd4bc8c17f |
   | Default Java | Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06 |
   | Multi-JDK versions | /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06 |
   | Test Results | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6765/2/testReport/ |
   | Max. process+thread count | 2789 (vs. ulimit of 5500) |
   | modules | C: hadoop-hdfs-project/hadoop-hdfs U: hadoop-hdfs-project/hadoop-hdfs |
   | Console output | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6765/2/console |
   | versions | git=2.25.1 maven=3.6.3 spotbugs=4.2.2 |
   | Powered by | Apache Yetus 0.14.0 https://yetus.apache.org |
   
   
   This message was automatically generated.
   
   



Re: [PR] HDFS-17497. The number of bytes of the last committed block should be calculated into the file length [hadoop]

2024-04-23 Thread via GitHub


ZanderXu commented on PR #6765:
URL: https://github.com/apache/hadoop/pull/6765#issuecomment-2073863664

   @xkrogen, I would really appreciate your review of this PR when you are available, since you are familiar with HDFS-10843.





Re: [PR] HDFS-17497. The number of bytes of the last committed block should be calculated into the file length [hadoop]

2024-04-23 Thread via GitHub


hadoop-yetus commented on PR #6765:
URL: https://github.com/apache/hadoop/pull/6765#issuecomment-2073379567

   :broken_heart: **-1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |:----:|----------:|--------:|:--------:|:-------:|
   | +0 :ok: |  reexec  |   0m 45s |  |  Docker mode activated.  |
   |||| _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  0s |  |  No case conflicting files found.  |
   | +0 :ok: |  codespell  |   0m  0s |  |  codespell was not available.  |
   | +0 :ok: |  detsecrets  |   0m  0s |  |  detect-secrets was not available.  |
   | +1 :green_heart: |  @author  |   0m  0s |  |  The patch does not contain any @author tags.  |
   | +1 :green_heart: |  test4tests  |   0m  0s |  |  The patch appears to include 1 new or modified test files.  |
   |||| _ trunk Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |  50m 48s |  |  trunk passed  |
   | +1 :green_heart: |  compile  |   1m 23s |  |  trunk passed with JDK Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1  |
   | +1 :green_heart: |  compile  |   1m 16s |  |  trunk passed with JDK Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06  |
   | +1 :green_heart: |  checkstyle  |   1m 14s |  |  trunk passed  |
   | +1 :green_heart: |  mvnsite  |   1m 24s |  |  trunk passed  |
   | +1 :green_heart: |  javadoc  |   1m  8s |  |  trunk passed with JDK Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1  |
   | +1 :green_heart: |  javadoc  |   1m 44s |  |  trunk passed with JDK Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06  |
   | +1 :green_heart: |  spotbugs  |   3m 16s |  |  trunk passed  |
   | +1 :green_heart: |  shadedclient  |  40m 56s |  |  branch has no errors when building and testing our client artifacts.  |
   |||| _ Patch Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |   1m 12s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   1m 15s |  |  the patch passed with JDK Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1  |
   | +1 :green_heart: |  javac  |   1m 15s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   1m  6s |  |  the patch passed with JDK Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06  |
   | +1 :green_heart: |  javac  |   1m  6s |  |  the patch passed  |
   | +1 :green_heart: |  blanks  |   0m  0s |  |  The patch has no blanks issues.  |
   | -0 :warning: |  checkstyle  |   1m  2s | [/results-checkstyle-hadoop-hdfs-project_hadoop-hdfs.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6765/1/artifact/out/results-checkstyle-hadoop-hdfs-project_hadoop-hdfs.txt) |  hadoop-hdfs-project/hadoop-hdfs: The patch generated 4 new + 290 unchanged - 1 fixed = 294 total (was 291)  |
   | +1 :green_heart: |  mvnsite  |   1m 13s |  |  the patch passed  |
   | +1 :green_heart: |  javadoc  |   0m 56s |  |  the patch passed with JDK Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1  |
   | +1 :green_heart: |  javadoc  |   1m 32s |  |  the patch passed with JDK Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06  |
   | +1 :green_heart: |  spotbugs  |   3m 18s |  |  the patch passed  |
   | +1 :green_heart: |  shadedclient  |  40m 47s |  |  patch has no errors when building and testing our client artifacts.  |
   |||| _ Other Tests _ |
   | -1 :x: |  unit  | 268m 19s | [/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6765/1/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt) |  hadoop-hdfs in the patch passed.  |
   | +1 :green_heart: |  asflicense  |   0m 41s |  |  The patch does not generate ASF License warnings.  |
   |  |   | 426m 24s |  |  |
   
   
   | Reason | Tests |
   |-------:|:------|
   | Failed junit tests | hadoop.hdfs.server.datanode.TestLargeBlockReport |
   
   
   | Subsystem | Report/Notes |
   |----------:|:-------------|
   | Docker | ClientAPI=1.44 ServerAPI=1.44 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6765/1/artifact/out/Dockerfile |
   | GITHUB PR | https://github.com/apache/hadoop/pull/6765 |
   | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets |
   | uname | Linux be83e02b2463 5.15.0-94-generic #104-Ubuntu SMP Tue Jan 9 15:25:40 UTC 2024 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | dev-support/bin/hadoop.sh |
   | git revision | trunk / 0c54e4a633f06fcaceb4781e63f06bb8a76b0fc8 |
   | Default Java | Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06 |
   | Multi-JDK versions | /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06 |
   | Test Results | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6765/1/testReport/ |
   | Max. process+thread count | 3165 (vs. ulimit of 5500) |
   | modules | C: hadoop-hdfs-project/hadoop-hdfs U: 
hadoop-hdfs-project/hado

[PR] HDFS-17497. The number of bytes of the last committed block should be calculated into the file length [hadoop]

2024-04-23 Thread via GitHub


ZanderXu opened a new pull request, #6765:
URL: https://github.com/apache/hadoop/pull/6765

   An HDFS file that is being written may contain multiple committed blocks, as follows (assume the file contains three blocks):
   
   | | Block 1 | Block 2 | Block 3 |
   | -- | -- | -- | -- |
   | Case 1 | Complete | Commit | UnderConstruction |
   | Case 2 | Complete | Commit | Commit |
   | Case 3 | Commit | Commit | Commit |
   
   However, committed blocks are handled inconsistently when computing the file size: the bytes of the last committed block are ignored, while the bytes of the other committed blocks are counted.
   
       public final long computeFileSize(boolean includesLastUcBlock,
                                         boolean usePreferredBlockSize4LastUcBlock) {
         if (blocks.length == 0) {
           return 0;
         }
         final int last = blocks.length - 1;
         // check if the last block is BlockInfoUnderConstruction
         BlockInfo lastBlk = blocks[last];
         long size = lastBlk.getNumBytes();
         // the last committed block is not complete, so its bytes may be ignored.
         if (!lastBlk.isComplete()) {
           if (!includesLastUcBlock) {
             size = 0;
           } else if (usePreferredBlockSize4LastUcBlock) {
             size = isStriped()?
                 getPreferredBlockSize() *
                     ((BlockInfoStriped)lastBlk).getDataBlockNum() :
                 getPreferredBlockSize();
           }
         }
         // The bytes of other committed blocks are calculated into the file length.
         for (int i = 0; i < last; i++) {
           size += blocks[i].getNumBytes();
         }
         return size;
       }
   
   The number of bytes in a committed block no longer changes, so the bytes of the last committed block should be counted in the file length too.
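
The intended behavior can be sketched with a simplified model. This is not the actual patch: `BlockState` and `Block` are hypothetical stand-ins for the real block classes, only the three states from the table above are modeled, and the `usePreferredBlockSize4LastUcBlock` branch is left out for brevity.

```java
/** Simplified model; BlockState and Block are hypothetical stand-ins. */
public class FileSizeSketch {
  enum BlockState { COMPLETE, COMMITTED, UNDER_CONSTRUCTION }

  static final class Block {
    final BlockState state;
    final long numBytes;
    Block(BlockState state, long numBytes) {
      this.state = state;
      this.numBytes = numBytes;
    }
  }

  /**
   * Sums block lengths. The last block is skipped only when it is still
   * under construction and the caller excludes such blocks; a committed
   * last block is always counted because its length is final.
   */
  static long computeFileSize(Block[] blocks, boolean includesLastUcBlock) {
    if (blocks.length == 0) {
      return 0;
    }
    final int last = blocks.length - 1;
    long size = 0;
    for (int i = 0; i < last; i++) {
      size += blocks[i].numBytes;
    }
    Block lastBlk = blocks[last];
    boolean lengthIsFinal = lastBlk.state != BlockState.UNDER_CONSTRUCTION;
    if (lengthIsFinal || includesLastUcBlock) {
      size += lastBlk.numBytes;
    }
    return size;
  }

  public static void main(String[] args) {
    Block[] case1 = {
        new Block(BlockState.COMPLETE, 128),
        new Block(BlockState.COMMITTED, 128),
        new Block(BlockState.UNDER_CONSTRUCTION, 10)};
    Block[] case2 = {
        new Block(BlockState.COMPLETE, 128),
        new Block(BlockState.COMMITTED, 128),
        new Block(BlockState.COMMITTED, 64)};
    System.out.println(computeFileSize(case1, false)); // 256
    System.out.println(computeFileSize(case2, false)); // 320
  }
}
```

In Case 1 the under-construction tail is still excluded on request, while in Case 2 the committed tail contributes its 64 bytes even when `includesLastUcBlock` is false.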
    
   Committed blocks are also handled inconsistently when computing the file length in DFSInputStream. Normally, DFSInputStream does not fetch the visible length for a committed block, regardless of whether that block is the last one.
    
   [HDFS-10843](https://issues.apache.org/jira/browse/HDFS-10843) noticed a bug that was actually caused by the committed block, but it fixed that bug in a different way.
   The number of bytes in a committed block will no longer change, so we should update the quota usage when the block is committed, which reduces the quota usage delta promptly.
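
The quota adjustment described here can be illustrated with a small bookkeeping model. It assumes that a newly allocated block is charged at the full preferred block size per replica and that the unused part is released once the final length is known; the class and method names are hypothetical, not the `FSDirectory` API.

```java
/** Simplified model of quota bookkeeping; all names are hypothetical. */
public class QuotaDeltaSketch {
  /** Space charged against the quota, in bytes across all replicas. */
  private long spaceConsumed;

  /** Assumption: a new block is charged at the full preferred block size. */
  void onBlockAllocated(long preferredBlockSize, short replication) {
    spaceConsumed += preferredBlockSize * replication;
  }

  /** Releases the unused part of the reservation once the length is final. */
  void onBlockCommitted(long preferredBlockSize, long numBytes,
      short replication) {
    long diff = preferredBlockSize - numBytes;
    if (diff > 0) {
      spaceConsumed -= diff * replication;
    }
  }

  long getSpaceConsumed() {
    return spaceConsumed;
  }

  public static void main(String[] args) {
    QuotaDeltaSketch quota = new QuotaDeltaSketch();
    long preferred = 128L * 1024 * 1024;
    quota.onBlockAllocated(preferred, (short) 3);
    // Committing a 1 KiB block releases the rest of the reservation.
    quota.onBlockCommitted(preferred, 1024L, (short) 3);
    System.out.println(quota.getSpaceConsumed()); // 3072
  }
}
```

Releasing the difference at commit time rather than at completion shortens the window in which the quota overstates the file's real usage.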
   

