[jira] [Commented] (HADOOP-18752) Change fs.s3a.directory.marker.retention to "keep"
[ https://issues.apache.org/jira/browse/HADOOP-18752?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17745188#comment-17745188 ]

Steve Loughran commented on HADOOP-18752:

I have pulled in everything but the default change into branch-3.3, but am not marking this as "fixed" there as it isn't; just the logging and defaults are in sync.

> Change fs.s3a.directory.marker.retention to "keep"
> --------------------------------------------------
>
>                 Key: HADOOP-18752
>                 URL: https://issues.apache.org/jira/browse/HADOOP-18752
>             Project: Hadoop Common
>          Issue Type: Sub-task
>          Components: fs/s3
>    Affects Versions: 3.3.5
>            Reporter: Steve Loughran
>            Assignee: Steve Loughran
>            Priority: Major
>              Labels: pull-request-available
>             Fix For: 3.4.0
>
>
> Change the default value of "fs.s3a.directory.marker.retention" to keep;
> update docs to match.
> maybe include with HADOOP-17802 so we don't blow up with fewer markers being
> created.

--
This message was sent by Atlassian Jira
(v8.20.10#820010)

To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Commented] (HADOOP-18752) Change fs.s3a.directory.marker.retention to "keep"
[ https://issues.apache.org/jira/browse/HADOOP-18752?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17745187#comment-17745187 ]

ASF GitHub Bot commented on HADOOP-18752:

steveloughran merged PR #5859:
URL: https://github.com/apache/hadoop/pull/5859
[jira] [Commented] (HADOOP-18752) Change fs.s3a.directory.marker.retention to "keep"
[ https://issues.apache.org/jira/browse/HADOOP-18752?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17745186#comment-17745186 ]

ASF GitHub Bot commented on HADOOP-18752:

steveloughran commented on PR #5859:
URL: https://github.com/apache/hadoop/pull/5859#issuecomment-1644342424

OK, backport has gone in cleanly. This does not change the default value, as discussed, even though I'd like to.
[jira] [Commented] (HADOOP-18752) Change fs.s3a.directory.marker.retention to "keep"
[ https://issues.apache.org/jira/browse/HADOOP-18752?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17745175#comment-17745175 ]

ASF GitHub Bot commented on HADOOP-18752:

hadoop-yetus commented on PR #5859:
URL: https://github.com/apache/hadoop/pull/5859#issuecomment-1644290192

:broken_heart: **-1 overall**

| Vote | Subsystem | Runtime | Logfile | Comment |
|:----:|----------:|--------:|:-------:|:-------:|
| +0 :ok: | reexec | 5m 39s | | Docker mode activated. |
|||| _ Prechecks _ |
| +1 :green_heart: | dupname | 0m 0s | | No case conflicting files found. |
| +0 :ok: | codespell | 0m 0s | | codespell was not available. |
| +0 :ok: | detsecrets | 0m 0s | | detect-secrets was not available. |
| +0 :ok: | markdownlint | 0m 0s | | markdownlint was not available. |
| +1 :green_heart: | @author | 0m 0s | | The patch does not contain any @author tags. |
| -1 :x: | test4tests | 0m 0s | | The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. |
|||| _ branch-3.3 Compile Tests _ |
| +1 :green_heart: | mvninstall | 36m 5s | | branch-3.3 passed |
| +1 :green_heart: | compile | 0m 30s | | branch-3.3 passed |
| +1 :green_heart: | checkstyle | 0m 24s | | branch-3.3 passed |
| +1 :green_heart: | mvnsite | 0m 37s | | branch-3.3 passed |
| +1 :green_heart: | javadoc | 0m 30s | | branch-3.3 passed |
| +1 :green_heart: | spotbugs | 0m 54s | | branch-3.3 passed |
| +1 :green_heart: | shadedclient | 23m 48s | | branch has no errors when building and testing our client artifacts. |
|||| _ Patch Compile Tests _ |
| +1 :green_heart: | mvninstall | 0m 30s | | the patch passed |
| +1 :green_heart: | compile | 0m 24s | | the patch passed |
| +1 :green_heart: | javac | 0m 24s | | the patch passed |
| +1 :green_heart: | blanks | 0m 0s | | The patch has no blanks issues. |
| +1 :green_heart: | checkstyle | 0m 15s | | the patch passed |
| +1 :green_heart: | mvnsite | 0m 27s | | the patch passed |
| +1 :green_heart: | javadoc | 0m 19s | | the patch passed |
| +1 :green_heart: | spotbugs | 0m 50s | | the patch passed |
| +1 :green_heart: | shadedclient | 23m 14s | | patch has no errors when building and testing our client artifacts. |
|||| _ Other Tests _ |
| +1 :green_heart: | unit | 2m 20s | | hadoop-aws in the patch passed. |
| +1 :green_heart: | asflicense | 0m 28s | | The patch does not generate ASF License warnings. |
| | | 99m 11s | | |

| Subsystem | Report/Notes |
|----------:|:-------------|
| Docker | ClientAPI=1.43 ServerAPI=1.43 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5859/4/artifact/out/Dockerfile |
| GITHUB PR | https://github.com/apache/hadoop/pull/5859 |
| Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets markdownlint |
| uname | Linux ca5c4acd6d1b 4.15.0-213-generic #224-Ubuntu SMP Mon Jun 19 13:30:12 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | dev-support/bin/hadoop.sh |
| git revision | branch-3.3 / 25e932285ac5713fcc1033ffb6bcf729a0e88c47 |
| Default Java | Private Build-1.8.0_362-8u372-ga~us1-0ubuntu1~18.04-b09 |
| Test Results | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5859/4/testReport/ |
| Max. process+thread count | 681 (vs. ulimit of 5500) |
| modules | C: hadoop-tools/hadoop-aws U: hadoop-tools/hadoop-aws |
| Console output | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5859/4/console |
| versions | git=2.17.1 maven=3.6.0 spotbugs=4.2.2 |
| Powered by | Apache Yetus 0.14.0 https://yetus.apache.org |

This message was automatically generated.
[jira] [Commented] (HADOOP-18752) Change fs.s3a.directory.marker.retention to "keep"
[ https://issues.apache.org/jira/browse/HADOOP-18752?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17745104#comment-17745104 ]

ASF GitHub Bot commented on HADOOP-18752:

hadoop-yetus commented on PR #5859:
URL: https://github.com/apache/hadoop/pull/5859#issuecomment-1643998812

:broken_heart: **-1 overall**

| Vote | Subsystem | Runtime | Logfile | Comment |
|:----:|----------:|--------:|:-------:|:-------:|
| +0 :ok: | reexec | 7m 18s | | Docker mode activated. |
|||| _ Prechecks _ |
| +1 :green_heart: | dupname | 0m 0s | | No case conflicting files found. |
| +0 :ok: | codespell | 0m 0s | | codespell was not available. |
| +0 :ok: | detsecrets | 0m 0s | | detect-secrets was not available. |
| +0 :ok: | markdownlint | 0m 0s | | markdownlint was not available. |
| +1 :green_heart: | @author | 0m 0s | | The patch does not contain any @author tags. |
| -1 :x: | test4tests | 0m 0s | | The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. |
|||| _ branch-3.3 Compile Tests _ |
| +1 :green_heart: | mvninstall | 52m 43s | | branch-3.3 passed |
| +1 :green_heart: | compile | 0m 43s | | branch-3.3 passed |
| +1 :green_heart: | checkstyle | 0m 36s | | branch-3.3 passed |
| +1 :green_heart: | mvnsite | 0m 50s | | branch-3.3 passed |
| +1 :green_heart: | javadoc | 0m 42s | | branch-3.3 passed |
| +1 :green_heart: | spotbugs | 1m 17s | | branch-3.3 passed |
| +1 :green_heart: | shadedclient | 37m 48s | | branch has no errors when building and testing our client artifacts. |
|||| _ Patch Compile Tests _ |
| +1 :green_heart: | mvninstall | 0m 40s | | the patch passed |
| +1 :green_heart: | compile | 0m 33s | | the patch passed |
| +1 :green_heart: | javac | 0m 33s | | the patch passed |
| +1 :green_heart: | blanks | 0m 0s | | The patch has no blanks issues. |
| +1 :green_heart: | checkstyle | 0m 22s | | the patch passed |
| +1 :green_heart: | mvnsite | 0m 38s | | the patch passed |
| +1 :green_heart: | javadoc | 0m 27s | | the patch passed |
| +1 :green_heart: | spotbugs | 1m 10s | | the patch passed |
| +1 :green_heart: | shadedclient | 37m 9s | | patch has no errors when building and testing our client artifacts. |
|||| _ Other Tests _ |
| +1 :green_heart: | unit | 2m 47s | | hadoop-aws in the patch passed. |
| +1 :green_heart: | asflicense | 0m 40s | | The patch does not generate ASF License warnings. |
| | | 149m 39s | | |

| Subsystem | Report/Notes |
|----------:|:-------------|
| Docker | ClientAPI=1.43 ServerAPI=1.43 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5859/3/artifact/out/Dockerfile |
| GITHUB PR | https://github.com/apache/hadoop/pull/5859 |
| Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets markdownlint |
| uname | Linux 7e5380e1a46b 4.15.0-212-generic #223-Ubuntu SMP Tue May 23 13:09:22 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | dev-support/bin/hadoop.sh |
| git revision | branch-3.3 / b986515adc2ece7f099809305ed4fb83e030bfa2 |
| Default Java | Private Build-1.8.0_362-8u372-ga~us1-0ubuntu1~18.04-b09 |
| Test Results | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5859/3/testReport/ |
| Max. process+thread count | 552 (vs. ulimit of 5500) |
| modules | C: hadoop-tools/hadoop-aws U: hadoop-tools/hadoop-aws |
| Console output | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5859/3/console |
| versions | git=2.17.1 maven=3.6.0 spotbugs=4.2.2 |
| Powered by | Apache Yetus 0.14.0 https://yetus.apache.org |

This message was automatically generated.
[jira] [Commented] (HADOOP-18752) Change fs.s3a.directory.marker.retention to "keep"
[ https://issues.apache.org/jira/browse/HADOOP-18752?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17745028#comment-17745028 ]

ASF GitHub Bot commented on HADOOP-18752:

steveloughran commented on code in PR #5859:
URL: https://github.com/apache/hadoop/pull/5859#discussion_r1269279118

## hadoop-tools/hadoop-aws/src/site/markdown/tools/hadoop-aws/testing.md: ##
@@ -339,16 +339,19 @@ Hadoop supports [different policies for directory marker retention](directory_ma
 -essentially the classic "delete" and the higher-performance "keep" options; "authoritative" is just "keep" restricted to a part of the bucket.
-Example: test with `markers=delete`
+
+Example: test with `markers=keep`
 ```
-mvn verify -Dparallel-tests -DtestsThreadCount=4 -Dmarkers=delete
+mvn verify -Dparallel-tests -DtestsThreadCount=4 -Dmarkers=keep
 ```
-Example: test with `markers=keep`
+This is the default and does not need to be explicitly set.

Review Comment:
   oops; wrong
[jira] [Commented] (HADOOP-18752) Change fs.s3a.directory.marker.retention to "keep"
[ https://issues.apache.org/jira/browse/HADOOP-18752?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17744779#comment-17744779 ]

ASF GitHub Bot commented on HADOOP-18752:

hadoop-yetus commented on PR #5859:
URL: https://github.com/apache/hadoop/pull/5859#issuecomment-1642724285

:broken_heart: **-1 overall**

| Vote | Subsystem | Runtime | Logfile | Comment |
|:----:|----------:|--------:|:-------:|:-------:|
| +0 :ok: | reexec | 11m 4s | | Docker mode activated. |
|||| _ Prechecks _ |
| +1 :green_heart: | dupname | 0m 0s | | No case conflicting files found. |
| +0 :ok: | codespell | 0m 0s | | codespell was not available. |
| +0 :ok: | detsecrets | 0m 0s | | detect-secrets was not available. |
| +0 :ok: | markdownlint | 0m 0s | | markdownlint was not available. |
| +1 :green_heart: | @author | 0m 0s | | The patch does not contain any @author tags. |
| -1 :x: | test4tests | 0m 0s | | The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. |
|||| _ branch-3.3 Compile Tests _ |
| +1 :green_heart: | mvninstall | 53m 22s | | branch-3.3 passed |
| +1 :green_heart: | compile | 0m 38s | | branch-3.3 passed |
| +1 :green_heart: | checkstyle | 0m 30s | | branch-3.3 passed |
| +1 :green_heart: | mvnsite | 0m 45s | | branch-3.3 passed |
| +1 :green_heart: | javadoc | 0m 34s | | branch-3.3 passed |
| +1 :green_heart: | spotbugs | 1m 12s | | branch-3.3 passed |
| +1 :green_heart: | shadedclient | 42m 50s | | branch has no errors when building and testing our client artifacts. |
|||| _ Patch Compile Tests _ |
| +1 :green_heart: | mvninstall | 0m 39s | | the patch passed |
| +1 :green_heart: | compile | 0m 31s | | the patch passed |
| +1 :green_heart: | javac | 0m 31s | | the patch passed |
| +1 :green_heart: | blanks | 0m 0s | | The patch has no blanks issues. |
| +1 :green_heart: | checkstyle | 0m 19s | | the patch passed |
| +1 :green_heart: | mvnsite | 0m 36s | | the patch passed |
| +1 :green_heart: | javadoc | 0m 25s | | the patch passed |
| +1 :green_heart: | spotbugs | 1m 14s | | the patch passed |
| +1 :green_heart: | shadedclient | 40m 14s | | patch has no errors when building and testing our client artifacts. |
|||| _ Other Tests _ |
| +1 :green_heart: | unit | 2m 31s | | hadoop-aws in the patch passed. |
| +1 :green_heart: | asflicense | 0m 35s | | The patch does not generate ASF License warnings. |
| | | 160m 52s | | |

| Subsystem | Report/Notes |
|----------:|:-------------|
| Docker | ClientAPI=1.43 ServerAPI=1.43 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5859/2/artifact/out/Dockerfile |
| GITHUB PR | https://github.com/apache/hadoop/pull/5859 |
| Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets markdownlint |
| uname | Linux 4b8cb5861852 4.15.0-212-generic #223-Ubuntu SMP Tue May 23 13:09:22 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | dev-support/bin/hadoop.sh |
| git revision | branch-3.3 / 6a2b198dd70e08a1e93fddb8af44c3b752dd5f73 |
| Default Java | Private Build-1.8.0_362-8u372-ga~us1-0ubuntu1~18.04-b09 |
| Test Results | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5859/2/testReport/ |
| Max. process+thread count | 527 (vs. ulimit of 5500) |
| modules | C: hadoop-tools/hadoop-aws U: hadoop-tools/hadoop-aws |
| Console output | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5859/2/console |
| versions | git=2.17.1 maven=3.6.0 spotbugs=4.2.2 |
| Powered by | Apache Yetus 0.14.0 https://yetus.apache.org |

This message was automatically generated.
[jira] [Commented] (HADOOP-18752) Change fs.s3a.directory.marker.retention to "keep"
[ https://issues.apache.org/jira/browse/HADOOP-18752?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17744777#comment-17744777 ]

ASF GitHub Bot commented on HADOOP-18752:

hadoop-yetus commented on PR #5859:
URL: https://github.com/apache/hadoop/pull/5859#issuecomment-1642694187

:broken_heart: **-1 overall**

| Vote | Subsystem | Runtime | Logfile | Comment |
|:----:|----------:|--------:|:-------:|:-------:|
| +0 :ok: | reexec | 0m 55s | | Docker mode activated. |
|||| _ Prechecks _ |
| +1 :green_heart: | dupname | 0m 0s | | No case conflicting files found. |
| +0 :ok: | codespell | 0m 0s | | codespell was not available. |
| +0 :ok: | detsecrets | 0m 0s | | detect-secrets was not available. |
| +0 :ok: | markdownlint | 0m 0s | | markdownlint was not available. |
| +1 :green_heart: | @author | 0m 0s | | The patch does not contain any @author tags. |
| -1 :x: | test4tests | 0m 0s | | The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. |
|||| _ branch-3.3 Compile Tests _ |
| +1 :green_heart: | mvninstall | 49m 15s | | branch-3.3 passed |
| +1 :green_heart: | compile | 0m 42s | | branch-3.3 passed |
| +1 :green_heart: | checkstyle | 0m 37s | | branch-3.3 passed |
| +1 :green_heart: | mvnsite | 0m 50s | | branch-3.3 passed |
| +1 :green_heart: | javadoc | 0m 42s | | branch-3.3 passed |
| +1 :green_heart: | spotbugs | 1m 16s | | branch-3.3 passed |
| +1 :green_heart: | shadedclient | 37m 32s | | branch has no errors when building and testing our client artifacts. |
|||| _ Patch Compile Tests _ |
| +1 :green_heart: | mvninstall | 0m 43s | | the patch passed |
| +1 :green_heart: | compile | 0m 32s | | the patch passed |
| +1 :green_heart: | javac | 0m 32s | | the patch passed |
| +1 :green_heart: | blanks | 0m 0s | | The patch has no blanks issues. |
| +1 :green_heart: | checkstyle | 0m 22s | | the patch passed |
| +1 :green_heart: | mvnsite | 0m 38s | | the patch passed |
| +1 :green_heart: | javadoc | 0m 27s | | the patch passed |
| +1 :green_heart: | spotbugs | 1m 11s | | the patch passed |
| +1 :green_heart: | shadedclient | 37m 19s | | patch has no errors when building and testing our client artifacts. |
|||| _ Other Tests _ |
| +1 :green_heart: | unit | 2m 42s | | hadoop-aws in the patch passed. |
| +1 :green_heart: | asflicense | 0m 41s | | The patch does not generate ASF License warnings. |
| | | 139m 44s | | |

| Subsystem | Report/Notes |
|----------:|:-------------|
| Docker | ClientAPI=1.43 ServerAPI=1.43 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5859/1/artifact/out/Dockerfile |
| GITHUB PR | https://github.com/apache/hadoop/pull/5859 |
| Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets markdownlint |
| uname | Linux d6e4a71d159d 4.15.0-212-generic #223-Ubuntu SMP Tue May 23 13:09:22 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | dev-support/bin/hadoop.sh |
| git revision | branch-3.3 / 6a2b198dd70e08a1e93fddb8af44c3b752dd5f73 |
| Default Java | Private Build-1.8.0_362-8u372-ga~us1-0ubuntu1~18.04-b09 |
| Test Results | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5859/1/testReport/ |
| Max. process+thread count | 585 (vs. ulimit of 5500) |
| modules | C: hadoop-tools/hadoop-aws U: hadoop-tools/hadoop-aws |
| Console output | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5859/1/console |
| versions | git=2.17.1 maven=3.6.0 spotbugs=4.2.2 |
| Powered by | Apache Yetus 0.14.0 https://yetus.apache.org |

This message was automatically generated.
[jira] [Commented] (HADOOP-18752) Change fs.s3a.directory.marker.retention to "keep"
[ https://issues.apache.org/jira/browse/HADOOP-18752?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17744739#comment-17744739 ]

ASF GitHub Bot commented on HADOOP-18752:

steveloughran commented on PR #5859:
URL: https://github.com/apache/hadoop/pull/5859#issuecomment-1642515852

Testing: S3 London. This is a backport, and as it doesn't include the contentious issue of "actually changing the switch", I'm happy to cherry-pick as is. Let's see what the tests say now that I've rolled back the default.
[jira] [Commented] (HADOOP-18752) Change fs.s3a.directory.marker.retention to "keep"
[ https://issues.apache.org/jira/browse/HADOOP-18752?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17744736#comment-17744736 ]

ASF GitHub Bot commented on HADOOP-18752:

steveloughran commented on PR #5689:
URL: https://github.com/apache/hadoop/pull/5689#issuecomment-1642513305

Update: I have a PR of this for branch-3.3, #5859, which does everything except change the default and document that change. This is so those releases stop warning about incompatibility, and to keep the code more in sync between branches.
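Since the branch-3.3 backport leaves the default at "delete", a deployment on that branch that wants marker retention would presumably have to opt in explicitly. A minimal sketch of such a `core-site.xml` entry follows; the property name and the `keep` value are the ones quoted in this thread, while placing it in `core-site.xml` (rather than another configuration source) is an assumption for illustration.

```xml
<!-- Sketch only: opt in to directory marker retention explicitly.
     Assumes every client of the bucket is a marker-aware Hadoop release. -->
<property>
  <name>fs.s3a.directory.marker.retention</name>
  <value>keep</value>
</property>
```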
[jira] [Commented] (HADOOP-18752) Change fs.s3a.directory.marker.retention to "keep"
[ https://issues.apache.org/jira/browse/HADOOP-18752?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17744735#comment-17744735 ]

ASF GitHub Bot commented on HADOOP-18752:

steveloughran opened a new pull request, #5859:
URL: https://github.com/apache/hadoop/pull/5859

This change has all of PR #5689 *except* for changing the default value of marker retention from delete to keep.

1. leaves the default value of fs.s3a.directory.marker.retention at "delete"
2. no longer prints a message when an S3A FS instance is instantiated with any option other than delete.
3. Updates the directory marker documentation

Switching to marker retention improves performance on any S3 bucket as there are no needless marker DELETE requests, leading to a reduction in write IOPS and in any delays waiting for the DELETE call to finish. There are *very* significant improvements on versioned buckets, where tombstone markers slow down LIST operations: the more tombstones there are, the worse query planning gets. Having versioning enabled on production stores is the foundation of any data protection strategy, so this has tangible benefits in production.

Marker retention is *not* compatible with older hadoop releases; specifically
- Hadoop branch 2 < 2.10.2
- Any release of Hadoop 3.0.x and Hadoop 3.1.x
- Hadoop 3.2.0 and 3.2.1
- Hadoop 3.3.0

Incompatible releases have no problems reading data in stores where markers are retained, but can get confused when deleting or renaming directories.

Contributed by Steve Loughran

Change-Id: Ic9a05357a4b1b1ff6dfecf8b0f30e1eeedb2fe75

### Description of PR

### How was this patch tested?

### For code changes:

- [ ] Does the title of this PR start with the corresponding JIRA issue id (e.g. 'HADOOP-17799. Your PR title ...')?
- [ ] Object storage: have the integration tests been executed and the endpoint declared according to the connector-specific documentation?
- [ ] If adding new dependencies to the code, are these dependencies licensed in a way that is compatible for inclusion under [ASF 2.0](http://www.apache.org/legal/resolved.html#category-a)?
- [ ] If applicable, have you updated the `LICENSE`, `LICENSE-binary`, `NOTICE-binary` files?
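The list of incompatible releases in the PR description above can be read as a simple version predicate. The sketch below is only an illustration of that list, not code from Hadoop: the class `MarkerCompat` and its methods are hypothetical names, and the handling of versions the thread does not mention (anything below 2.x, and anything newer than the listed lines) is an assumption.

```java
/**
 * Hypothetical helper (not part of Hadoop): encodes the release list from the
 * PR description as a check for whether a Hadoop version can safely work with
 * buckets where directory markers are retained.
 */
public final class MarkerCompat {
    private MarkerCompat() {}

    /** Parse "3.3.5" or "3.4.0-SNAPSHOT" into {major, minor, patch}. */
    static int[] parse(String version) {
        String core = version.split("-", 2)[0];   // drop any "-SNAPSHOT" style suffix
        String[] parts = core.split("\\.");
        int[] v = new int[3];                     // missing components default to 0
        for (int i = 0; i < 3 && i < parts.length; i++) {
            v[i] = Integer.parseInt(parts[i]);
        }
        return v;
    }

    /** True if the release is outside the incompatible set listed in the PR. */
    static boolean isMarkerAware(String version) {
        int[] v = parse(version);
        int major = v[0], minor = v[1], patch = v[2];
        if (major < 2) {
            return false;                         // assumption: not covered by the thread
        }
        if (major == 2) {
            // "Hadoop branch 2 < 2.10.2" is incompatible
            return minor > 10 || (minor == 10 && patch >= 2);
        }
        if (major == 3) {
            switch (minor) {
                case 0:
                case 1:
                    return false;                 // any 3.0.x or 3.1.x release
                case 2:
                    return patch >= 2;            // 3.2.0 and 3.2.1 are incompatible
                case 3:
                    return patch >= 1;            // 3.3.0 is incompatible
                default:
                    return true;                  // assumption: later 3.x lines are fine
            }
        }
        return true;                              // assumption: anything newer is fine
    }

    public static void main(String[] args) {
        for (String v : new String[] {"2.10.1", "2.10.2", "3.1.4", "3.2.1", "3.3.0", "3.3.5"}) {
            System.out.println(v + " marker-aware: " + isMarkerAware(v));
        }
    }
}
```

A check like this could be used as a guard before switching a shared bucket to "keep", since the thread notes that incompatible releases can read such stores but may get confused when deleting or renaming directories.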
[jira] [Commented] (HADOOP-18752) Change fs.s3a.directory.marker.retention to "keep"
[ https://issues.apache.org/jira/browse/HADOOP-18752?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17730508#comment-17730508 ]

ASF GitHub Bot commented on HADOOP-18752:

steveloughran merged PR #5689:
URL: https://github.com/apache/hadoop/pull/5689
[jira] [Commented] (HADOOP-18752) Change fs.s3a.directory.marker.retention to "keep"
[ https://issues.apache.org/jira/browse/HADOOP-18752?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17730225#comment-17730225 ]

ASF GitHub Bot commented on HADOOP-18752:

steveloughran commented on PR #5689:
URL: https://github.com/apache/hadoop/pull/5689#issuecomment-1581239126

@dannycjones: you happy with the changes now? I've got Ayush's upvote already
[jira] [Commented] (HADOOP-18752) Change fs.s3a.directory.marker.retention to "keep"
[ https://issues.apache.org/jira/browse/HADOOP-18752?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17729821#comment-17729821 ]

ASF GitHub Bot commented on HADOOP-18752:

hadoop-yetus commented on PR #5689:
URL: https://github.com/apache/hadoop/pull/5689#issuecomment-1579232711

:broken_heart: **-1 overall**

| Vote | Subsystem | Runtime | Logfile | Comment |
|:----:|----------:|--------:|:-------:|:-------:|
| +0 :ok: | reexec | 1m 4s | | Docker mode activated. |
| | | | | _ Prechecks _ |
| +1 :green_heart: | dupname | 0m 1s | | No case conflicting files found. |
| +0 :ok: | codespell | 0m 1s | | codespell was not available. |
| +0 :ok: | detsecrets | 0m 1s | | detect-secrets was not available. |
| +0 :ok: | markdownlint | 0m 1s | | markdownlint was not available. |
| +1 :green_heart: | @author | 0m 0s | | The patch does not contain any @author tags. |
| -1 :x: | test4tests | 0m 0s | | The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. |
| | | | | _ trunk Compile Tests _ |
| +1 :green_heart: | mvninstall | 42m 5s | | trunk passed |
| +1 :green_heart: | compile | 0m 37s | | trunk passed with JDK Ubuntu-11.0.19+7-post-Ubuntu-0ubuntu120.04.1 |
| +1 :green_heart: | compile | 0m 30s | | trunk passed with JDK Private Build-1.8.0_362-8u372-ga~us1-0ubuntu1~20.04-b09 |
| +1 :green_heart: | checkstyle | 0m 31s | | trunk passed |
| +1 :green_heart: | mvnsite | 0m 37s | | trunk passed |
| +1 :green_heart: | javadoc | 0m 28s | | trunk passed with JDK Ubuntu-11.0.19+7-post-Ubuntu-0ubuntu120.04.1 |
| +1 :green_heart: | javadoc | 0m 29s | | trunk passed with JDK Private Build-1.8.0_362-8u372-ga~us1-0ubuntu1~20.04-b09 |
| +1 :green_heart: | spotbugs | 1m 14s | | trunk passed |
| +1 :green_heart: | shadedclient | 23m 40s | | branch has no errors when building and testing our client artifacts. |
| | | | | _ Patch Compile Tests _ |
| +1 :green_heart: | mvninstall | 0m 27s | | the patch passed |
| +1 :green_heart: | compile | 0m 31s | | the patch passed with JDK Ubuntu-11.0.19+7-post-Ubuntu-0ubuntu120.04.1 |
| +1 :green_heart: | javac | 0m 31s | | the patch passed |
| +1 :green_heart: | compile | 0m 24s | | the patch passed with JDK Private Build-1.8.0_362-8u372-ga~us1-0ubuntu1~20.04-b09 |
| +1 :green_heart: | javac | 0m 24s | | the patch passed |
| +1 :green_heart: | blanks | 0m 0s | | The patch has no blanks issues. |
| +1 :green_heart: | checkstyle | 0m 19s | | the patch passed |
| +1 :green_heart: | mvnsite | 0m 29s | | the patch passed |
| +1 :green_heart: | javadoc | 0m 13s | | the patch passed with JDK Ubuntu-11.0.19+7-post-Ubuntu-0ubuntu120.04.1 |
| +1 :green_heart: | javadoc | 0m 22s | | the patch passed with JDK Private Build-1.8.0_362-8u372-ga~us1-0ubuntu1~20.04-b09 |
| +1 :green_heart: | spotbugs | 1m 4s | | the patch passed |
| +1 :green_heart: | shadedclient | 23m 10s | | patch has no errors when building and testing our client artifacts. |
| | | | | _ Other Tests _ |
| +1 :green_heart: | unit | 2m 20s | | hadoop-aws in the patch passed. |
| +1 :green_heart: | asflicense | 0m 34s | | The patch does not generate ASF License warnings. |
| | | 103m 13s | | |

| Subsystem | Report/Notes |
|----------:|:-------------|
| Docker | ClientAPI=1.43 ServerAPI=1.43 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5689/2/artifact/out/Dockerfile |
| GITHUB PR | https://github.com/apache/hadoop/pull/5689 |
| Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets markdownlint |
| uname | Linux f4f59a935c3f 4.15.0-206-generic #217-Ubuntu SMP Fri Feb 3 19:10:13 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | dev-support/bin/hadoop.sh |
| git revision | trunk / 1b33b71f2323508c50e543468921b0d63f953141 |
| Default Java | Private Build-1.8.0_362-8u372-ga~us1-0ubuntu1~20.04-b09 |
| Multi-JDK versions | /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.19+7-post-Ubuntu-0ubuntu120.04.1 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_362-8u372-ga~us1-0ubuntu1~20.04-b09 |
| Test Results | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5689/2/testReport/ |
| Max. process+thread count | 588 (vs. ulimit of 5500) |
| modules | C: hadoop-tools/hadoop-aws U: hadoop-tools/hadoop-aws |
| Console output | https://ci-hadoop.apache.org/job/hadoop-multibranch/j
[jira] [Commented] (HADOOP-18752) Change fs.s3a.directory.marker.retention to "keep"
[ https://issues.apache.org/jira/browse/HADOOP-18752?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17729218#comment-17729218 ]

ASF GitHub Bot commented on HADOOP-18752:

dannycjones commented on code in PR #5689:
URL: https://github.com/apache/hadoop/pull/5689#discussion_r1217662352

##
hadoop-tools/hadoop-aws/src/site/markdown/tools/hadoop-aws/directory_markers.md:
##
@@ -12,35 +12,40 @@ limitations under the License. See accompanying LICENSE file.
 -->

-# Experimental: Controlling the S3A Directory Marker Behavior
+# Controlling the S3A Directory Marker Behavior

-This document discusses an experimental feature of the S3A
-connector since Hadoop 3.3.1: the ability to retain directory
-marker objects above paths containing files or subdirectories.
+This document discusses an performance feature of the S3A
+connector: directory markers are not deleted unless the
+client is explicitly configured to do so.

 ## Critical: this is not backwards compatible!

 This document shows how the performance of S3 I/O, especially applications
 creating many files (for example Apache Hive) or working with versioned S3 buckets can
 increase performance by changing the S3A directory marker retention policy.

-Changing the policy from the default value, `"delete"` _is not backwards compatible_.
+The default policy in this release of hadoop is "keep",
+which _is not backwards compatible_ with hadoop versions
+released before 2021.

-Versions of Hadoop which are incompatible with other marker retention policies,
-as of August 2020.
+The compatibility table of older releases is as follows:

-| Branch     | Compatible Since | Supported |
-|------------|------------------|-----------|
-| Hadoop 2.x | n/a              | WONTFIX   |
-| Hadoop 3.0 | check            | Read-only |
-| Hadoop 3.1 | check            | Read-only |
-| Hadoop 3.2 | check            | Read-only |
-| Hadoop 3.3 | 3.3.1            | Done      |
+| Branch     | Compatible Since | Supported | Released |
+|------------|------------------|-----------|----------|
+| Hadoop 2.x | 2.10.2           | Read-only | 05/2022  |
+| Hadoop 3.0 | n/a              | WONTFIX   |          |
+| Hadoop 3.1 | n/a              | WONTFIX   |          |
+| Hadoop 3.2 | 3.2.2            | Read-only | 01/2022  |
+| Hadoop 3.3 | 3.3.1            | Done      | 01/2021  |

Review Comment:
   nice, that's great then

> Change fs.s3a.directory.marker.retention to "keep"
> --------------------------------------------------
>
>                 Key: HADOOP-18752
>                 URL: https://issues.apache.org/jira/browse/HADOOP-18752
>             Project: Hadoop Common
>          Issue Type: Sub-task
>          Components: fs/s3
>    Affects Versions: 3.3.5
>            Reporter: Steve Loughran
>            Assignee: Steve Loughran
>            Priority: Major
>              Labels: pull-request-available
>
> Change the default value of "fs.s3a.directory.marker.retention" to keep;
> update docs to match.
> maybe include with HADOOP-17802 so we don't blow up with fewer markers being
> created.

--
This message was sent by Atlassian Jira (v8.20.10#820010)

To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Commented] (HADOOP-18752) Change fs.s3a.directory.marker.retention to "keep"
[ https://issues.apache.org/jira/browse/HADOOP-18752?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17729217#comment-17729217 ]

ASF GitHub Bot commented on HADOOP-18752:

dannycjones commented on code in PR #5689:
URL: https://github.com/apache/hadoop/pull/5689#discussion_r1217662040

##
hadoop-tools/hadoop-aws/src/site/markdown/tools/hadoop-aws/directory_markers.md:
##
@@ -161,7 +176,7 @@
 When a file is created under a path, the directory marker is deleted.
 And when a file is deleted, if it was the last file in the directory, the marker is recreated.

-And, historically, When a path is listed, if a marker to that path is found, *it
+And, historically, when a path is listed, if a marker to that path is found, *it
 has been interpreted as an empty directory.*

Review Comment:
   ACK. For public documentation, probably makes sense to keep it as is. It was just a little confusing when digging into the detail.
[jira] [Commented] (HADOOP-18752) Change fs.s3a.directory.marker.retention to "keep"
[ https://issues.apache.org/jira/browse/HADOOP-18752?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17728794#comment-17728794 ]

ASF GitHub Bot commented on HADOOP-18752:

steveloughran commented on code in PR #5689:
URL: https://github.com/apache/hadoop/pull/5689#discussion_r1214527663

##
hadoop-tools/hadoop-aws/src/site/markdown/tools/hadoop-aws/directory_markers.md:
##
@@ -161,7 +176,7 @@
 When a file is created under a path, the directory marker is deleted.
 And when a file is deleted, if it was the last file in the directory, the marker is recreated.

-And, historically, When a path is listed, if a marker to that path is found, *it
+And, historically, when a path is listed, if a marker to that path is found, *it
 has been interpreted as an empty directory.*

Review Comment:
   it is for some specific codepaths which do a probe which explicitly looks for empty dirs, rm and mv in particular.
[jira] [Commented] (HADOOP-18752) Change fs.s3a.directory.marker.retention to "keep"
[ https://issues.apache.org/jira/browse/HADOOP-18752?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17728793#comment-17728793 ]

ASF GitHub Bot commented on HADOOP-18752:

steveloughran commented on code in PR #5689:
URL: https://github.com/apache/hadoop/pull/5689#discussion_r1214526572

##
hadoop-tools/hadoop-aws/src/site/markdown/tools/hadoop-aws/directory_markers.md:
##
@@ -12,35 +12,40 @@ limitations under the License. See accompanying LICENSE file.
 -->

-# Experimental: Controlling the S3A Directory Marker Behavior
+# Controlling the S3A Directory Marker Behavior

-This document discusses an experimental feature of the S3A
-connector since Hadoop 3.3.1: the ability to retain directory
-marker objects above paths containing files or subdirectories.
+This document discusses an performance feature of the S3A
+connector: directory markers are not deleted unless the
+client is explicitly configured to do so.

 ## Critical: this is not backwards compatible!

 This document shows how the performance of S3 I/O, especially applications
 creating many files (for example Apache Hive) or working with versioned S3 buckets can
 increase performance by changing the S3A directory marker retention policy.

-Changing the policy from the default value, `"delete"` _is not backwards compatible_.
+The default policy in this release of hadoop is "keep",
+which _is not backwards compatible_ with hadoop versions
+released before 2021.

-Versions of Hadoop which are incompatible with other marker retention policies,
-as of August 2020.
+The compatibility table of older releases is as follows:

-| Branch     | Compatible Since | Supported |
-|------------|------------------|-----------|
-| Hadoop 2.x | n/a              | WONTFIX   |
-| Hadoop 3.0 | check            | Read-only |
-| Hadoop 3.1 | check            | Read-only |
-| Hadoop 3.2 | check            | Read-only |
-| Hadoop 3.3 | 3.3.1            | Done      |
+| Branch     | Compatible Since | Supported | Released |
+|------------|------------------|-----------|----------|
+| Hadoop 2.x | 2.10.2           | Read-only | 05/2022  |
+| Hadoop 3.0 | n/a              | WONTFIX   |          |
+| Hadoop 3.1 | n/a              | WONTFIX   |          |
+| Hadoop 3.2 | 3.2.2            | Read-only | 01/2022  |
+| Hadoop 3.3 | 3.3.1            | Done      | 01/2021  |

Review Comment:
   yeah, there was some format error already fixed.
[jira] [Commented] (HADOOP-18752) Change fs.s3a.directory.marker.retention to "keep"
[ https://issues.apache.org/jira/browse/HADOOP-18752?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17728686#comment-17728686 ]

ASF GitHub Bot commented on HADOOP-18752:

dannycjones commented on PR #5689:
URL: https://github.com/apache/hadoop/pull/5689#issuecomment-1573525360

Sorry for jumping in at the last minute with these. Basically want to try and make sure users are able to reason as easily as possible about when it's safe to flip over from `delete` to `keep`.
[jira] [Commented] (HADOOP-18752) Change fs.s3a.directory.marker.retention to "keep"
[ https://issues.apache.org/jira/browse/HADOOP-18752?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17728683#comment-17728683 ]

ASF GitHub Bot commented on HADOOP-18752:

dannycjones commented on code in PR #5689:
URL: https://github.com/apache/hadoop/pull/5689#discussion_r1214208594

##
hadoop-tools/hadoop-aws/src/site/markdown/tools/hadoop-aws/directory_markers.md:
##
@@ -161,7 +176,7 @@
 When a file is created under a path, the directory marker is deleted.
 And when a file is deleted, if it was the last file in the directory, the marker is recreated.

-And, historically, When a path is listed, if a marker to that path is found, *it
+And, historically, when a path is listed, if a marker to that path is found, *it
 has been interpreted as an empty directory.*

Review Comment:
   (This isn't added in this PR but...) is this really true? I tried an integ test using `listFiles` on the Hadoop 3.0 code base. It seemed happy.

   Is it worth being specific with what will or won't make this assumption?

##
hadoop-tools/hadoop-aws/src/site/markdown/tools/hadoop-aws/directory_markers.md:
##
@@ -237,29 +252,19 @@ of backwards compatibility.
 There is now an option `fs.s3a.directory.marker.retention` which controls how
 markers are managed when new files are created

-*Default* `delete`: a request is issued to delete any parental directory markers
+1.`delete`: a request is issued to delete any parental directory markers

Review Comment:
   markdown won't like this
   ```suggestion
   1. `delete`: a request is issued to delete any parental directory markers
   ```
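The marker lifecycle quoted in the hunk above (marker removed when a file is created under a path, recreated when the last file is deleted) and the difference between the `delete` and `keep` policies can be sketched with a toy in-memory key set. This is only an illustrative model: `MarkerModel`, `Policy`, and every method name are hypothetical, and none of it is taken from the S3A connector's code.

```java
import java.util.TreeSet;

/** Toy in-memory model of directory markers; hypothetical, not Hadoop code. */
final class MarkerModel {
    enum Policy { DELETE, KEEP }

    private final Policy policy;
    // Keys in a pretend bucket; a key ending in "/" stands for a directory marker.
    private final TreeSet<String> keys = new TreeSet<>();

    MarkerModel(Policy policy) { this.policy = policy; }

    /** Create an empty directory by writing its marker key. */
    void mkdir(String dir) { keys.add(dir); }

    /** Create a file; under DELETE, remove the markers of all parent directories. */
    void createFile(String key) {
        keys.add(key);
        if (policy == Policy.DELETE) {
            for (int slash = key.lastIndexOf('/'); slash > 0;
                 slash = key.lastIndexOf('/', slash - 1)) {
                keys.remove(key.substring(0, slash + 1));
            }
        }
    }

    /** Delete a file; if its parent is now empty, (re)create the parent marker. */
    void deleteFile(String key) {
        keys.remove(key);
        int slash = key.lastIndexOf('/');
        if (slash > 0) {
            String parent = key.substring(0, slash + 1);
            if (isReallyEmpty(parent)) {
                keys.add(parent);
            }
        }
    }

    boolean hasMarker(String dir) { return keys.contains(dir); }

    /** True if no key other than the marker itself is stored under the prefix. */
    boolean isReallyEmpty(String dir) {
        String next = keys.higher(dir);
        return next == null || !next.startsWith(dir);
    }

    public static void main(String[] args) {
        for (Policy p : Policy.values()) {
            MarkerModel m = new MarkerModel(p);
            m.mkdir("data/");
            m.createFile("data/part-0000");
            System.out.println(p + ": marker present=" + m.hasMarker("data/")
                + ", really empty=" + m.isReallyEmpty("data/"));
        }
    }
}
```

In this model, a client that treats "marker present" as "directory is empty" is only correct under `DELETE`; under `KEEP` the marker can sit above real files, which is the compatibility concern the thread discusses for older code paths.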
[jira] [Commented] (HADOOP-18752) Change fs.s3a.directory.marker.retention to "keep"
[ https://issues.apache.org/jira/browse/HADOOP-18752?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17728679#comment-17728679 ]

ASF GitHub Bot commented on HADOOP-18752:

dannycjones commented on code in PR #5689:
URL: https://github.com/apache/hadoop/pull/5689#discussion_r1214206939

##
hadoop-tools/hadoop-aws/src/site/markdown/tools/hadoop-aws/directory_markers.md:
##
@@ -12,35 +12,40 @@ limitations under the License. See accompanying LICENSE file.
 -->

-# Experimental: Controlling the S3A Directory Marker Behavior
+# Controlling the S3A Directory Marker Behavior

-This document discusses an experimental feature of the S3A
-connector since Hadoop 3.3.1: the ability to retain directory
-marker objects above paths containing files or subdirectories.
+This document discusses an performance feature of the S3A
+connector: directory markers are not deleted unless the
+client is explicitly configured to do so.

Review Comment:
   if this PR gets updated, small, typo to fix
   ```suggestion
   This document discusses a performance feature of the S3A
   connector: directory markers are not deleted unless the
   client is explicitly configured to do so.
   ```
[jira] [Commented] (HADOOP-18752) Change fs.s3a.directory.marker.retention to "keep"
[ https://issues.apache.org/jira/browse/HADOOP-18752?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17728672#comment-17728672 ]

ASF GitHub Bot commented on HADOOP-18752:

dannycjones commented on code in PR #5689:
URL: https://github.com/apache/hadoop/pull/5689#discussion_r1214190161

##
hadoop-tools/hadoop-aws/src/site/markdown/tools/hadoop-aws/directory_markers.md:
##
@@ -12,35 +12,40 @@ limitations under the License. See accompanying LICENSE file.
 -->

-# Experimental: Controlling the S3A Directory Marker Behavior
+# Controlling the S3A Directory Marker Behavior

-This document discusses an experimental feature of the S3A
-connector since Hadoop 3.3.1: the ability to retain directory
-marker objects above paths containing files or subdirectories.
+This document discusses an performance feature of the S3A
+connector: directory markers are not deleted unless the
+client is explicitly configured to do so.

 ## Critical: this is not backwards compatible!

 This document shows how the performance of S3 I/O, especially applications
 creating many files (for example Apache Hive) or working with versioned S3 buckets can
 increase performance by changing the S3A directory marker retention policy.

-Changing the policy from the default value, `"delete"` _is not backwards compatible_.
+The default policy in this release of hadoop is "keep",
+which _is not backwards compatible_ with hadoop versions
+released before 2021.

-Versions of Hadoop which are incompatible with other marker retention policies,
-as of August 2020.
+The compatibility table of older releases is as follows:

-| Branch     | Compatible Since | Supported |
-|------------|------------------|-----------|
-| Hadoop 2.x | n/a              | WONTFIX   |
-| Hadoop 3.0 | check            | Read-only |
-| Hadoop 3.1 | check            | Read-only |
-| Hadoop 3.2 | check            | Read-only |
-| Hadoop 3.3 | 3.3.1            | Done      |
+| Branch     | Compatible Since | Supported | Released |
+|------------|------------------|-----------|----------|
+| Hadoop 2.x | 2.10.2           | Read-only | 05/2022  |
+| Hadoop 3.0 | n/a              | WONTFIX   |          |
+| Hadoop 3.1 | n/a              | WONTFIX   |          |
+| Hadoop 3.2 | 3.2.2            | Read-only | 01/2022  |
+| Hadoop 3.3 | 3.3.1            | Done      | 01/2021  |

Review Comment:
   Thanks for updating this with the extra info.

   Do we know why the Hadoop webpages aren't formatting the original table? https://hadoop.apache.org/docs/stable/hadoop-aws/tools/hadoop-aws/directory_markers.html#The_Problem_with_Directory_Markers
[jira] [Commented] (HADOOP-18752) Change fs.s3a.directory.marker.retention to "keep"
[ https://issues.apache.org/jira/browse/HADOOP-18752?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17728409#comment-17728409 ]

ASF GitHub Bot commented on HADOOP-18752:

steveloughran commented on PR #5689:
URL: https://github.com/apache/hadoop/pull/5689#issuecomment-1572316027

Thanks. I think the tests were all good but will rerun to be 100% sure.
[jira] [Commented] (HADOOP-18752) Change fs.s3a.directory.marker.retention to "keep"
[ https://issues.apache.org/jira/browse/HADOOP-18752?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17728122#comment-17728122 ]

ASF GitHub Bot commented on HADOOP-18752:

steveloughran commented on PR #5689:
URL: https://github.com/apache/hadoop/pull/5689#issuecomment-1570827515

noted. well, let's target 3.4 at the very least and tag as incompatible
[jira] [Commented] (HADOOP-18752) Change fs.s3a.directory.marker.retention to "keep"
[ https://issues.apache.org/jira/browse/HADOOP-18752?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17727884#comment-17727884 ]

ASF GitHub Bot commented on HADOOP-18752:

ayushtkn commented on PR #5689:
URL: https://github.com/apache/hadoop/pull/5689#issuecomment-1569931307

yep, it is like that; there are a lot of discussions and tickets around this. For example, [HDFS-13505](https://issues.apache.org/jira/browse/HDFS-13505) is also marked as incompatible and was pushed only to trunk. This comment also says the same thing (https://issues.apache.org/jira/browse/HDFS-13505?focusedCommentId=16854777&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-16854777)

Maybe the reason is this: suppose someone explicitly wanted the conf to be "delete" for whatever reason, and because the default value was also "delete", they never configured it. If you now change the default to "keep", that person has to explicitly configure it to "delete" to preserve the old behaviour.

Not against this change, just sharing the generic stuff around config defaults, from what I have read or know about compat :-)
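The compatibility point in the comment above, that a deployment relying on the old default silently changes behaviour while one that set the option explicitly does not, can be shown with a small sketch. It uses `java.util.Properties` purely as a stand-in for a site configuration; the class `MarkerDefaultDemo` and the method `effectivePolicy` are hypothetical names for illustration, and only the property key and the values `delete` and `keep` come from the thread.

```java
import java.util.Properties;

/** Illustrates how a changed default affects only deployments that never set the key. */
final class MarkerDefaultDemo {
    static final String KEY = "fs.s3a.directory.marker.retention";

    /** An explicit setting wins; otherwise the release default applies. */
    static String effectivePolicy(Properties siteConfig, String releaseDefault) {
        return siteConfig.getProperty(KEY, releaseDefault);
    }

    public static void main(String[] args) {
        Properties unset = new Properties();      // relied on the default
        Properties pinned = new Properties();     // configured explicitly
        pinned.setProperty(KEY, "delete");

        // Old release default vs. new release default.
        for (String releaseDefault : new String[] {"delete", "keep"}) {
            System.out.println("default=" + releaseDefault
                + " unset->" + effectivePolicy(unset, releaseDefault)
                + " pinned->" + effectivePolicy(pinned, releaseDefault));
        }
    }
}
```

A deployment that must keep the old behaviour across the upgrade would therefore need to set the property to `delete` explicitly rather than depend on the default.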