[jira] [Commented] (HADOOP-18873) ABFS: AbfsOutputStream doesnt close DataBlocks object.
[ https://issues.apache.org/jira/browse/HADOOP-18873?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17768594#comment-17768594 ] ASF GitHub Bot commented on HADOOP-18873: mehakmeet merged PR #6105: URL: https://github.com/apache/hadoop/pull/6105

> ABFS: AbfsOutputStream doesnt close DataBlocks object.
> --
>
> Key: HADOOP-18873
> URL: https://issues.apache.org/jira/browse/HADOOP-18873
> Project: Hadoop Common
> Issue Type: Sub-task
> Affects Versions: 3.3.4
> Reporter: Pranav Saxena
> Assignee: Pranav Saxena
> Priority: Major
> Labels: pull-request-available
> Fix For: 3.3.4
>
> AbfsOutputStream doesn't close the dataBlock object created for the upload.
> The implications of not doing that:
> DataBlocks has three implementations:
> # ByteArrayBlock
> ## This creates a DataBlockByteArrayOutputStream (child of ByteArrayOutputStream): a wrapper around a byte array for populating and reading the array.
> ## This gets GCed.
> # ByteBufferBlock:
> ## There is a defined *DirectBufferPool* from which it requests a directBuffer.
> ## If nothing is available in the pool, a new directBuffer is created.
> ## The `close` method on this object is responsible for returning the buffer to the pool so it can be reused.
> ## Since we are not calling `close`:
> ### The pool becomes much less useful, since each request allocates a new directBuffer from memory.
> ### The objects can be GCed and the allocated direct memory may eventually be reclaimed on GC, but if the process crashes, the memory is never returned and can cause memory pressure on the machine.
> # DiskBlock:
> ## This creates a file on disk to which the data-to-upload is written. This file gets deleted in startUpload().close().
>
> startUpload() returns a BlockUploadData object whose `toByteArray()` method is used in AbfsOutputStream to get the byte array backing the dataBlock.
>
> Method which uses the DataBlock object:
> https://github.com/apache/hadoop/blob/fac7d26c5d7f791565cc3ab45d079e2cca725f95/hadoop-tools/hadoop-azure/src/main/java/org/apache/hadoop/fs/azurebfs/services/AbfsOutputStream.java#L298

--
This message was sent by Atlassian Jira (v8.20.10#820010)
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org
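The ByteBufferBlock point in the description above can be made concrete with a small sketch: a block that borrows a direct buffer from a pool and whose `close()` returns it. The class names below (`BufferPool`, `Block`, `PoolDemo`) are illustrative stand-ins, not Hadoop's actual `org.apache.hadoop.util.DirectBufferPool` or `DataBlocks.ByteBufferBlock`; the point is only to show why skipping `close()` defeats pooling.

```java
import java.nio.ByteBuffer;
import java.util.ArrayDeque;
import java.util.Queue;

public class PoolDemo {

    /** Minimal stand-in for a direct-buffer pool. */
    static class BufferPool {
        private final Queue<ByteBuffer> free = new ArrayDeque<>();
        private int allocations = 0;

        ByteBuffer acquire(int size) {
            ByteBuffer b = free.poll();
            if (b == null) {
                allocations++;                    // nothing pooled: allocate fresh direct memory
                b = ByteBuffer.allocateDirect(size);
            }
            b.clear();
            return b;
        }

        void release(ByteBuffer b) {              // what the block's close() must do
            free.offer(b);
        }

        int allocations() {
            return allocations;
        }
    }

    /** Stand-in for a buffer-backed data block. */
    static class Block implements AutoCloseable {
        private final BufferPool pool;
        private final ByteBuffer buf;

        Block(BufferPool pool, int size) {
            this.pool = pool;
            this.buf = pool.acquire(size);
        }

        @Override
        public void close() {                     // skipping this strands the buffer
            pool.release(buf);
        }
    }

    public static void main(String[] args) {
        BufferPool pool = new BufferPool();
        try (Block b1 = new Block(pool, 1024)) { }   // allocates one buffer
        try (Block b2 = new Block(pool, 1024)) { }   // reuses the returned buffer
        System.out.println("with close(): " + pool.allocations() + " allocation(s)");

        BufferPool leaky = new BufferPool();
        new Block(leaky, 1024);                      // never closed
        new Block(leaky, 1024);                      // never closed: second allocation
        System.out.println("without close(): " + leaky.allocations() + " allocation(s)");
    }
}
```

With `close()` the pool satisfies every request after the first from recycled memory; without it, each block forces a new direct allocation, which is the leak the issue describes.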
[jira] [Commented] (HADOOP-18873) ABFS: AbfsOutputStream doesnt close DataBlocks object.
[ https://issues.apache.org/jira/browse/HADOOP-18873?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17767471#comment-17767471 ] ASF GitHub Bot commented on HADOOP-18873: saxenapranav commented on PR #6010: URL: https://github.com/apache/hadoop/pull/6010#issuecomment-1729142543

Thanks @mehakmeet for the review. I have cherry-picked the same in branch-3.3 in PR: https://github.com/apache/hadoop/pull/6105. Thanks.
[jira] [Commented] (HADOOP-18873) ABFS: AbfsOutputStream doesnt close DataBlocks object.
[ https://issues.apache.org/jira/browse/HADOOP-18873?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17767456#comment-17767456 ] ASF GitHub Bot commented on HADOOP-18873: hadoop-yetus commented on PR #6105: URL: https://github.com/apache/hadoop/pull/6105#issuecomment-1729106917

:broken_heart: **-1 overall**

| Vote | Subsystem | Runtime | Logfile | Comment |
|:----:|----------:|:--------|:-------:|:-------:|
| +0 :ok: | reexec | 11m 12s | | Docker mode activated. |
|||| _ Prechecks _ |
| +1 :green_heart: | dupname | 0m 0s | | No case conflicting files found. |
| +0 :ok: | codespell | 0m 0s | | codespell was not available. |
| +0 :ok: | detsecrets | 0m 0s | | detect-secrets was not available. |
| +1 :green_heart: | @author | 0m 0s | | The patch does not contain any @author tags. |
| +1 :green_heart: | test4tests | 0m 0s | | The patch appears to include 1 new or modified test files. |
|||| _ branch-3.3 Compile Tests _ |
| +0 :ok: | mvndep | 13m 35s | | Maven dependency ordering for branch |
| -1 :x: | mvninstall | 37m 56s | [/branch-mvninstall-root.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6105/1/artifact/out/branch-mvninstall-root.txt) | root in branch-3.3 failed. |
| +1 :green_heart: | compile | 20m 17s | | branch-3.3 passed |
| +1 :green_heart: | checkstyle | 2m 45s | | branch-3.3 passed |
| +1 :green_heart: | mvnsite | 2m 43s | | branch-3.3 passed |
| +1 :green_heart: | javadoc | 1m 41s | | branch-3.3 passed |
| +1 :green_heart: | spotbugs | 3m 55s | | branch-3.3 passed |
| -1 :x: | shadedclient | 39m 2s | | branch has errors when building and testing our client artifacts. |
|||| _ Patch Compile Tests _ |
| +0 :ok: | mvndep | 0m 33s | | Maven dependency ordering for patch |
| +1 :green_heart: | mvninstall | 1m 41s | | the patch passed |
| +1 :green_heart: | compile | 19m 13s | | the patch passed |
| +1 :green_heart: | javac | 19m 13s | | the patch passed |
| +1 :green_heart: | blanks | 0m 0s | | The patch has no blanks issues. |
| +1 :green_heart: | checkstyle | 2m 46s | | the patch passed |
| +1 :green_heart: | mvnsite | 2m 49s | | the patch passed |
| +1 :green_heart: | javadoc | 1m 46s | | the patch passed |
| +1 :green_heart: | spotbugs | 4m 23s | | the patch passed |
| -1 :x: | shadedclient | 38m 30s | | patch has errors when building and testing our client artifacts. |
|||| _ Other Tests _ |
| +1 :green_heart: | unit | 18m 31s | | hadoop-common in the patch passed. |
| +1 :green_heart: | unit | 2m 29s | | hadoop-azure in the patch passed. |
| +1 :green_heart: | asflicense | 1m 8s | | The patch does not generate ASF License warnings. |
| | | 233m 33s | | |

| Subsystem | Report/Notes |
|----------:|:-------------|
| Docker | ClientAPI=1.43 ServerAPI=1.43 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6105/1/artifact/out/Dockerfile |
| GITHUB PR | https://github.com/apache/hadoop/pull/6105 |
| Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets |
| uname | Linux 6557821a6162 4.15.0-212-generic #223-Ubuntu SMP Tue May 23 13:09:22 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | dev-support/bin/hadoop.sh |
| git revision | branch-3.3 / 3f87cc52cac5d7dd57a6e65ca1b26a34aa57781d |
| Default Java | Private Build-1.8.0_362-8u372-ga~us1-0ubuntu1~18.04-b09 |
| Test Results | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6105/1/testReport/ |
| Max. process+thread count | 3152 (vs. ulimit of 5500) |
| modules | C: hadoop-common-project/hadoop-common hadoop-tools/hadoop-azure U: . |
| Console output | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6105/1/console |
| versions | git=2.17.1 maven=3.6.0 spotbugs=4.2.2 |
| Powered by | Apache Yetus 0.14.0 https://yetus.apache.org |

This message was automatically generated.
[jira] [Commented] (HADOOP-18873) ABFS: AbfsOutputStream doesnt close DataBlocks object.
[ https://issues.apache.org/jira/browse/HADOOP-18873?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17767372#comment-17767372 ] ASF GitHub Bot commented on HADOOP-18873: saxenapranav opened a new pull request, #6105: URL: https://github.com/apache/hadoop/pull/6105

JIRA: https://issues.apache.org/jira/browse/HADOOP-18873
Trunk PR: https://github.com/apache/hadoop/pull/6010
Commit cherry-picked from trunk: https://github.com/apache/hadoop/commit/f24b73e5f3ac640f491231f02d9d8afaf1855b5c

AGGREGATED TEST RESULT

HNS-OAuth
[INFO] Results:
[ERROR] Failures:
[ERROR] TestAccountConfiguration.testConfigPropNotFound:386->testMissingConfigKey:399 Expected a org.apache.hadoop.fs.azurebfs.contracts.exceptions.TokenAccessProviderException to be thrown, but got the result: : "org.apache.hadoop.fs.azurebfs.oauth2.ClientCredsTokenProvider"
[ERROR] Tests run: 141, Failures: 1, Errors: 0, Skipped: 5
[INFO] Results:
[ERROR] Errors:
[ERROR] ITestAzureBlobFileSystemLease.testAcquireRetry:336 » TestTimedOut test timed o...
[ERROR] Tests run: 588, Failures: 0, Errors: 1, Skipped: 54
[INFO] Results:
[WARNING] Tests run: 339, Failures: 0, Errors: 0, Skipped: 41

HNS-SharedKey
[INFO] Results:
[ERROR] Failures:
[ERROR] TestAccountConfiguration.testConfigPropNotFound:386->testMissingConfigKey:399 Expected a org.apache.hadoop.fs.azurebfs.contracts.exceptions.TokenAccessProviderException to be thrown, but got the result: : "org.apache.hadoop.fs.azurebfs.oauth2.ClientCredsTokenProvider"
[ERROR] Tests run: 141, Failures: 1, Errors: 0, Skipped: 5
[INFO] Results:
[ERROR] Errors:
[ERROR] ITestAzureBlobFileSystemLease.testAcquireRetry:336 » TestTimedOut test timed o...
[ERROR] Tests run: 588, Failures: 0, Errors: 1, Skipped: 54
[INFO] Results:
[WARNING] Tests run: 339, Failures: 0, Errors: 0, Skipped: 41

NonHNS-SharedKey
[INFO] Results:
[ERROR] Failures:
[ERROR] TestAccountConfiguration.testConfigPropNotFound:386->testMissingConfigKey:399 Expected a org.apache.hadoop.fs.azurebfs.contracts.exceptions.TokenAccessProviderException to be thrown, but got the result: : "org.apache.hadoop.fs.azurebfs.oauth2.ClientCredsTokenProvider"
[ERROR] Tests run: 141, Failures: 1, Errors: 0, Skipped: 11
[INFO] Results:
[ERROR] Failures:
[ERROR] ITestAzureBlobFileSystemRandomRead.testSkipBounds:218->Assert.assertTrue:42->Assert.fail:89 There should not be any network I/O (elapsedTimeMs=25).
[ERROR] Errors:
[ERROR] ITestAzureBlobFileSystemLease.testAcquireRetry:344->lambda$testAcquireRetry$6:345 » TestTimedOut
[ERROR] Tests run: 588, Failures: 1, Errors: 1, Skipped: 277
[INFO] Results:
[WARNING] Tests run: 339, Failures: 0, Errors: 0, Skipped: 44

AppendBlob-HNS-OAuth
[INFO] Results:
[ERROR] Failures:
[ERROR] TestAccountConfiguration.testConfigPropNotFound:386->testMissingConfigKey:399 Expected a org.apache.hadoop.fs.azurebfs.contracts.exceptions.TokenAccessProviderException to be thrown, but got the result: : "org.apache.hadoop.fs.azurebfs.oauth2.ClientCredsTokenProvider"
[ERROR] Tests run: 141, Failures: 1, Errors: 0, Skipped: 5
[INFO] Results:
[ERROR] Failures:
[ERROR] ITestAzureBlobFileSystemRandomRead.testValidateSeekBounds:269->Assert.assertTrue:42->Assert.fail:89 There should not be any network I/O (elapsedTimeMs=113).
[ERROR] Errors:
[ERROR] ITestAzureBlobFileSystemLease.testAcquireRetry:344->lambda$testAcquireRetry$6:345 » TestTimedOut
[ERROR] Tests run: 588, Failures: 1, Errors: 1, Skipped: 54
[INFO] Results:
[WARNING] Tests run: 339, Failures: 0, Errors: 0, Skipped: 41

Time taken: 45 mins 16 secs.
azureuser@Hadoop-VM-EAST2:~/hadoop/hadoop-tools/hadoop-azure$ git log
commit 3f87cc52cac5d7dd57a6e65ca1b26a34aa57781d (HEAD -> branch-3.3_HADOOP-18873, origin/branch-3.3_HADOOP-18873)
Author: Pranav Saxena <108325433+saxenapra...@users.noreply.github.com>
Date: Wed Sep 20 01:54:36 2023 -0700

    HADOOP-18873. ABFS: AbfsOutputStream doesnt close DataBlocks object. (#6010)

    AbfsOutputStream to close the dataBlock object created for the upload.

    Contributed By: Pranav Saxena
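The commit above says the fix is for AbfsOutputStream to close the dataBlock it creates (a later commit note in this thread mentions closing via IOUtils). A minimal, self-contained sketch of that close-in-`finally` pattern follows; `DataBlock`, `closeQuietly`, and `upload` are hypothetical stand-ins for illustration, not the actual AbfsOutputStream code or Hadoop's IOUtils helpers.

```java
public class CloseOnFinallyDemo {

    /** Stand-in for a data block whose close() releases pooled resources. */
    static class DataBlock implements AutoCloseable {
        boolean closed = false;

        @Override
        public void close() {
            closed = true;
        }
    }

    /** Close without propagating failures, in the spirit of Hadoop's IOUtils cleanup helpers. */
    static void closeQuietly(AutoCloseable c) {
        try {
            if (c != null) {
                c.close();
            }
        } catch (Exception ignored) {
            // best-effort cleanup: a close failure should not mask the upload outcome
        }
    }

    /** Upload path: the block is closed whether the upload succeeds or fails. */
    static void upload(DataBlock block, boolean simulateFailure) {
        try {
            if (simulateFailure) {
                throw new RuntimeException("simulated upload failure");
            }
            // ... upload the block's data here ...
        } finally {
            closeQuietly(block);   // runs on both the success and the failure path
        }
    }

    public static void main(String[] args) {
        DataBlock ok = new DataBlock();
        upload(ok, false);
        System.out.println("closed after success: " + ok.closed);

        DataBlock bad = new DataBlock();
        try {
            upload(bad, true);
        } catch (RuntimeException expected) {
            // the upload error still surfaces; cleanup already happened
        }
        System.out.println("closed after failure: " + bad.closed);
    }
}
```

Putting the close in `finally` matters because the leak described in the issue would otherwise persist on every failed upload as well as on the paths that simply forgot to close.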
[jira] [Commented] (HADOOP-18873) ABFS: AbfsOutputStream doesnt close DataBlocks object.
[ https://issues.apache.org/jira/browse/HADOOP-18873?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17764793#comment-17764793 ] ASF GitHub Bot commented on HADOOP-18873: steveloughran commented on PR #6010: URL: https://github.com/apache/hadoop/pull/6010#issuecomment-1717997799

...handing off review to @mehakmeet; if he's happy it's good
[jira] [Commented] (HADOOP-18873) ABFS: AbfsOutputStream doesnt close DataBlocks object.
[ https://issues.apache.org/jira/browse/HADOOP-18873?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17764761#comment-17764761 ] ASF GitHub Bot commented on HADOOP-18873: hadoop-yetus commented on PR #6010: URL: https://github.com/apache/hadoop/pull/6010#issuecomment-1717900485

:confetti_ball: **+1 overall**

| Vote | Subsystem | Runtime | Logfile | Comment |
|:----:|----------:|:--------|:-------:|:-------:|
| +0 :ok: | reexec | 0m 29s | | Docker mode activated. |
|||| _ Prechecks _ |
| +1 :green_heart: | dupname | 0m 0s | | No case conflicting files found. |
| +0 :ok: | codespell | 0m 0s | | codespell was not available. |
| +0 :ok: | detsecrets | 0m 0s | | detect-secrets was not available. |
| +1 :green_heart: | @author | 0m 0s | | The patch does not contain any @author tags. |
| +1 :green_heart: | test4tests | 0m 0s | | The patch appears to include 1 new or modified test files. |
|||| _ trunk Compile Tests _ |
| +0 :ok: | mvndep | 14m 34s | | Maven dependency ordering for branch |
| +1 :green_heart: | mvninstall | 21m 0s | | trunk passed |
| +1 :green_heart: | compile | 11m 2s | | trunk passed with JDK Ubuntu-11.0.20+8-post-Ubuntu-1ubuntu120.04 |
| +1 :green_heart: | compile | 9m 33s | | trunk passed with JDK Private Build-1.8.0_382-8u382-ga-1~20.04.1-b05 |
| +1 :green_heart: | checkstyle | 2m 26s | | trunk passed |
| +1 :green_heart: | mvnsite | 1m 52s | | trunk passed |
| +1 :green_heart: | javadoc | 1m 37s | | trunk passed with JDK Ubuntu-11.0.20+8-post-Ubuntu-1ubuntu120.04 |
| +1 :green_heart: | javadoc | 1m 19s | | trunk passed with JDK Private Build-1.8.0_382-8u382-ga-1~20.04.1-b05 |
| +1 :green_heart: | spotbugs | 2m 30s | | trunk passed |
| +1 :green_heart: | shadedclient | 23m 21s | | branch has no errors when building and testing our client artifacts. |
|||| _ Patch Compile Tests _ |
| +0 :ok: | mvndep | 0m 26s | | Maven dependency ordering for patch |
| +1 :green_heart: | mvninstall | 0m 56s | | the patch passed |
| +1 :green_heart: | compile | 11m 18s | | the patch passed with JDK Ubuntu-11.0.20+8-post-Ubuntu-1ubuntu120.04 |
| +1 :green_heart: | javac | 11m 18s | | the patch passed |
| +1 :green_heart: | compile | 9m 46s | | the patch passed with JDK Private Build-1.8.0_382-8u382-ga-1~20.04.1-b05 |
| +1 :green_heart: | javac | 9m 46s | | the patch passed |
| +1 :green_heart: | blanks | 0m 1s | | The patch has no blanks issues. |
| +1 :green_heart: | checkstyle | 2m 24s | | the patch passed |
| +1 :green_heart: | mvnsite | 1m 49s | | the patch passed |
| +1 :green_heart: | javadoc | 1m 30s | | the patch passed with JDK Ubuntu-11.0.20+8-post-Ubuntu-1ubuntu120.04 |
| +1 :green_heart: | javadoc | 1m 21s | | the patch passed with JDK Private Build-1.8.0_382-8u382-ga-1~20.04.1-b05 |
| +1 :green_heart: | spotbugs | 2m 55s | | the patch passed |
| +1 :green_heart: | shadedclient | 22m 27s | | patch has no errors when building and testing our client artifacts. |
|||| _ Other Tests _ |
| +1 :green_heart: | unit | 16m 34s | | hadoop-common in the patch passed. |
| +1 :green_heart: | unit | 2m 11s | | hadoop-azure in the patch passed. |
| +1 :green_heart: | asflicense | 0m 49s | | The patch does not generate ASF License warnings. |
| | | 169m 27s | | |

| Subsystem | Report/Notes |
|----------:|:-------------|
| Docker | ClientAPI=1.43 ServerAPI=1.43 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6010/6/artifact/out/Dockerfile |
| GITHUB PR | https://github.com/apache/hadoop/pull/6010 |
| Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets |
| uname | Linux 4090ed52a0ae 4.15.0-213-generic #224-Ubuntu SMP Mon Jun 19 13:30:12 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | dev-support/bin/hadoop.sh |
| git revision | trunk / d9da57b16ea00f4e79444a5109807ac6fef26e6c |
| Default Java | Private Build-1.8.0_382-8u382-ga-1~20.04.1-b05 |
| Multi-JDK versions | /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.20+8-post-Ubuntu-1ubuntu120.04 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_382-8u382-ga-1~20.04.1-b05 |
| Test Results | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6010/6/testReport/ |
| Max. process+thread count | 1738 (vs. ulimit of 5500) |
| modules | C: hadoop-common-project/hadoop-common hadoop-tools/hadoop-azure U: . |
| Console output | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6010/6/console |
[jira] [Commented] (HADOOP-18873) ABFS: AbfsOutputStream doesnt close DataBlocks object.
[ https://issues.apache.org/jira/browse/HADOOP-18873?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17764693#comment-17764693 ] ASF GitHub Bot commented on HADOOP-18873: saxenapranav commented on PR #6010: URL: https://github.com/apache/hadoop/pull/6010#issuecomment-1717686794

Thanks @mehakmeet for the review. I have taken the comments. Thank you.
[jira] [Commented] (HADOOP-18873) ABFS: AbfsOutputStream doesnt close DataBlocks object.
[ https://issues.apache.org/jira/browse/HADOOP-18873?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17764691#comment-17764691 ] ASF GitHub Bot commented on HADOOP-18873: saxenapranav commented on PR #6010: URL: https://github.com/apache/hadoop/pull/6010#issuecomment-1717685620

AGGREGATED TEST RESULT

HNS-OAuth
[INFO] Results:
[ERROR] Failures:
[ERROR] TestAccountConfiguration.testConfigPropNotFound:386->testMissingConfigKey:399 Expected a org.apache.hadoop.fs.azurebfs.contracts.exceptions.TokenAccessProviderException to be thrown, but got the result: : "org.apache.hadoop.fs.azurebfs.oauth2.ClientCredsTokenProvider"
[ERROR] Tests run: 141, Failures: 1, Errors: 0, Skipped: 5
[INFO] Results:
[ERROR] Errors:
[ERROR] ITestAzureBlobFileSystemLease.testAcquireRetry:329 » TestTimedOut test timed o...
[ERROR] Tests run: 589, Failures: 0, Errors: 1, Skipped: 54
[INFO] Results:
[WARNING] Tests run: 339, Failures: 0, Errors: 0, Skipped: 41

HNS-SharedKey
[INFO] Results:
[ERROR] Failures:
[ERROR] TestAccountConfiguration.testConfigPropNotFound:386->testMissingConfigKey:399 Expected a org.apache.hadoop.fs.azurebfs.contracts.exceptions.TokenAccessProviderException to be thrown, but got the result: : "org.apache.hadoop.fs.azurebfs.oauth2.ClientCredsTokenProvider"
[ERROR] Tests run: 141, Failures: 1, Errors: 0, Skipped: 5
[INFO] Results:
[ERROR] Failures:
[ERROR] ITestAzureBlobFileSystemRandomRead.testSkipBounds:218->Assert.assertTrue:42->Assert.fail:89 There should not be any network I/O (elapsedTimeMs=24).
[ERROR] Errors:
[ERROR] ITestAzureBlobFileSystemLease.testAcquireRetry:336 » TestTimedOut test timed o...
[ERROR] Tests run: 589, Failures: 1, Errors: 1, Skipped: 54
[INFO] Results:
[WARNING] Tests run: 339, Failures: 0, Errors: 0, Skipped: 41

NonHNS-SharedKey
[INFO] Results:
[ERROR] Failures:
[ERROR] TestAccountConfiguration.testConfigPropNotFound:386->testMissingConfigKey:399 Expected a org.apache.hadoop.fs.azurebfs.contracts.exceptions.TokenAccessProviderException to be thrown, but got the result: : "org.apache.hadoop.fs.azurebfs.oauth2.ClientCredsTokenProvider"
[ERROR] Tests run: 141, Failures: 1, Errors: 0, Skipped: 11
[INFO] Results:
[ERROR] Errors:
[ERROR] ITestAzureBlobFileSystemLease.testAcquireRetry:344->lambda$testAcquireRetry$6:345 » TestTimedOut
[ERROR] Tests run: 589, Failures: 0, Errors: 1, Skipped: 277
[INFO] Results:
[WARNING] Tests run: 339, Failures: 0, Errors: 0, Skipped: 44

AppendBlob-HNS-OAuth
[INFO] Results:
[ERROR] Failures:
[ERROR] TestAccountConfiguration.testConfigPropNotFound:386->testMissingConfigKey:399 Expected a org.apache.hadoop.fs.azurebfs.contracts.exceptions.TokenAccessProviderException to be thrown, but got the result: : "org.apache.hadoop.fs.azurebfs.oauth2.ClientCredsTokenProvider"
[ERROR] Tests run: 141, Failures: 1, Errors: 0, Skipped: 5
[INFO] Results:
[ERROR] Errors:
[ERROR] ITestAzureBlobFileSystemLease.testAcquireRetry:344->lambda$testAcquireRetry$6:345 » TestTimedOut
[ERROR] Tests run: 589, Failures: 0, Errors: 1, Skipped: 54
[INFO] Results:
[WARNING] Tests run: 339, Failures: 0, Errors: 0, Skipped: 41

Time taken: 48 mins 26 secs.

azureuser@Hadoop-VM-EAST2:~/hadoop/hadoop-tools/hadoop-azure$ git log
commit d9da57b16ea00f4e79444a5109807ac6fef26e6c (HEAD -> HADOOP-18873, origin/HADOOP-18873)
Author: Pranav Saxena <>
Date: Wed Sep 13 06:02:03 2023 -0700

    close with IoUtils; test refactors;
[jira] [Commented] (HADOOP-18873) ABFS: AbfsOutputStream doesnt close DataBlocks object.
[ https://issues.apache.org/jira/browse/HADOOP-18873?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17761263#comment-17761263 ] Pranav Saxena commented on HADOOP-18873:

pr: [HADOOP-18873. ABFS: AbfsOutputStream doesnt close DataBlocks object. by saxenapranav · Pull Request #6010 · apache/hadoop (github.com)|https://github.com/apache/hadoop/pull/6010]
[jira] [Commented] (HADOOP-18873) ABFS: AbfsOutputStream doesnt close DataBlocks object.
[ https://issues.apache.org/jira/browse/HADOOP-18873?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17760540#comment-17760540 ] Steve Loughran commented on HADOOP-18873:

yeah, this seems serious +[~mehakmeet]