[jira] [Commented] (HADOOP-19105) S3A: Recover from Vector IO read failures

2024-10-09 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-19105?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17888016#comment-17888016
 ] 

ASF GitHub Bot commented on HADOOP-19105:
-

steveloughran opened a new pull request, #7105:
URL: https://github.com/apache/hadoop/pull/7105

   
   Add a new releaser method, which is then invoked to release buffers on 
failure.
   
   It is a bit contrived how we try not to break external implementations when 
adding a new default implementation to PositionedReadable: the releaser is 
silently dropped unless implementations override the new method.
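   
   To make the compatibility concern concrete, here is a minimal, hypothetical 
sketch of the shape such a default method could take. The interface, names and 
signatures below are illustrative only, not the actual patch:
   
   ```java
   import java.io.IOException;
   import java.nio.ByteBuffer;
   import java.util.List;
   import java.util.function.Consumer;
   import java.util.function.IntFunction;
   
   import org.apache.hadoop.fs.FileRange;
   
   // Hypothetical stand-in for org.apache.hadoop.fs.PositionedReadable.
   interface VectoredReadableSketch {
   
     // Existing vector-read entry point: the caller supplies an allocator.
     void readVectored(List<? extends FileRange> ranges,
         IntFunction<ByteBuffer> allocate) throws IOException;
   
     // New overload taking a release callback for buffers of failed reads.
     // The default delegates to the classic method so external
     // implementations keep compiling -- but the release callback is then
     // silently dropped, which is the compatibility wrinkle noted above.
     default void readVectored(List<? extends FileRange> ranges,
         IntFunction<ByteBuffer> allocate,
         Consumer<ByteBuffer> release) throws IOException {
       readVectored(ranges, allocate);
     }
   }
   ```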
   
   
   
   
   
   ### For code changes:
   
   - [ ] Does the title of this PR start with the corresponding JIRA issue id 
(e.g. 'HADOOP-17799. Your PR title ...')?
   - [ ] Object storage: have the integration tests been executed and the 
endpoint declared according to the connector-specific documentation?
   - [ ] If adding new dependencies to the code, are these dependencies 
licensed in a way that is compatible for inclusion under [ASF 
2.0](http://www.apache.org/legal/resolved.html#category-a)?
   - [ ] If applicable, have you updated the `LICENSE`, `LICENSE-binary`, 
`NOTICE-binary` files?
   
   




> S3A: Recover from Vector IO read failures
> -
>
> Key: HADOOP-19105
> URL: https://issues.apache.org/jira/browse/HADOOP-19105
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Affects Versions: 3.4.0, 3.3.6
> Environment: This keeps the public API stable
>Reporter: Steve Loughran
>Priority: Major
>
> s3a vector IO doesn't try to recover from read failures the way read() does.
> Need to
> * abort the HTTP stream if considered necessary
> * retry the active read which failed
> * but not those which had already succeeded
> On a full failure we need to do something about any allocated buffers, which 
> means we really need the buffer pool {{ByteBufferPool}} to take returns, or 
> also provide a "release" (ByteBuffer -> void) call which does the return. We 
> would need to
> * add this as a new API with implementations in s3a, local, rawlocal
> * remap the classic single-allocator method to the new one with (() -> null) 
> as the release callback
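
A hedged illustration of that pool-side idea follows; the names here are 
illustrative and deliberately do not claim to match the actual 
{{ByteBufferPool}} API:

{code:java}
import java.nio.ByteBuffer;
import java.util.function.Consumer;

// Hypothetical sketch of a release-aware buffer pool.
interface ReleasingPoolSketch {

  // Classic allocation call, as on the existing allocate-only pools.
  ByteBuffer getBuffer(boolean direct, int length);

  // New call: hand a buffer back to the pool, e.g. after a failed read.
  // A classic pool has nowhere to put the buffer back, so the default is
  // a no-op -- the "(() -> null)" remapping described above.
  default void release(ByteBuffer buffer) {
    // no-op by default
  }

  // The release call can then be handed to a vectored read as a callback.
  default Consumer<ByteBuffer> releaser() {
    return this::release;
  }
}
{code}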



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Updated] (HADOOP-19105) S3A: Recover from Vector IO read failures

2024-10-09 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-19105?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HADOOP-19105:

Labels: pull-request-available  (was: )

> S3A: Recover from Vector IO read failures
> -
>
> Key: HADOOP-19105
> URL: https://issues.apache.org/jira/browse/HADOOP-19105
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Affects Versions: 3.4.0, 3.3.6
> Environment: This keeps the public API stable
>Reporter: Steve Loughran
>Priority: Major
>  Labels: pull-request-available
>
> s3a vector IO doesn't try to recover from read failures the way read() does.
> Need to
> * abort the HTTP stream if considered necessary
> * retry the active read which failed
> * but not those which had already succeeded
> On a full failure we need to do something about any allocated buffers, which 
> means we really need the buffer pool {{ByteBufferPool}} to take returns, or 
> also provide a "release" (ByteBuffer -> void) call which does the return. We 
> would need to
> * add this as a new API with implementations in s3a, local, rawlocal
> * remap the classic single-allocator method to the new one with (() -> null) 
> as the release callback



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-19278) S3A: remove option to delete directory markers

2024-10-09 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-19278?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17888005#comment-17888005
 ] 

ASF GitHub Bot commented on HADOOP-19278:
-

hadoop-yetus commented on PR #7052:
URL: https://github.com/apache/hadoop/pull/7052#issuecomment-2402835818

   :broken_heart: **-1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |::|--:|:|::|:---:|
   | +0 :ok: |  reexec  |   0m 36s |  |  Docker mode activated.  |
    _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  1s |  |  No case conflicting files 
found.  |
   | +0 :ok: |  codespell  |   0m  0s |  |  codespell was not available.  |
   | +0 :ok: |  detsecrets  |   0m  0s |  |  detect-secrets was not available.  
|
   | +0 :ok: |  xmllint  |   0m  0s |  |  xmllint was not available.  |
   | +0 :ok: |  markdownlint  |   0m  0s |  |  markdownlint was not available.  
|
   | +1 :green_heart: |  @author  |   0m  0s |  |  The patch does not contain 
any @author tags.  |
   | +1 :green_heart: |  test4tests  |   0m  0s |  |  The patch appears to 
include 34 new or modified test files.  |
    _ trunk Compile Tests _ |
   | -1 :x: |  mvninstall  |   0m 55s | 
[/branch-mvninstall-root.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-7052/3/artifact/out/branch-mvninstall-root.txt)
 |  root in trunk failed.  |
   | -1 :x: |  compile  |   4m  2s | 
[/branch-compile-hadoop-tools_hadoop-aws-jdkUbuntu-11.0.24+8-post-Ubuntu-1ubuntu320.04.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-7052/3/artifact/out/branch-compile-hadoop-tools_hadoop-aws-jdkUbuntu-11.0.24+8-post-Ubuntu-1ubuntu320.04.txt)
 |  hadoop-aws in trunk failed with JDK 
Ubuntu-11.0.24+8-post-Ubuntu-1ubuntu320.04.  |
   | -1 :x: |  compile  |   0m 23s | 
[/branch-compile-hadoop-tools_hadoop-aws-jdkPrivateBuild-1.8.0_422-8u422-b05-1~20.04-b05.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-7052/3/artifact/out/branch-compile-hadoop-tools_hadoop-aws-jdkPrivateBuild-1.8.0_422-8u422-b05-1~20.04-b05.txt)
 |  hadoop-aws in trunk failed with JDK Private 
Build-1.8.0_422-8u422-b05-1~20.04-b05.  |
   | -0 :warning: |  checkstyle  |   0m 20s | 
[/buildtool-branch-checkstyle-hadoop-tools_hadoop-aws.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-7052/3/artifact/out/buildtool-branch-checkstyle-hadoop-tools_hadoop-aws.txt)
 |  The patch fails to run checkstyle in hadoop-aws  |
   | -1 :x: |  mvnsite  |   0m 23s | 
[/branch-mvnsite-hadoop-tools_hadoop-aws.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-7052/3/artifact/out/branch-mvnsite-hadoop-tools_hadoop-aws.txt)
 |  hadoop-aws in trunk failed.  |
   | -1 :x: |  javadoc  |   0m 23s | 
[/branch-javadoc-hadoop-tools_hadoop-aws-jdkUbuntu-11.0.24+8-post-Ubuntu-1ubuntu320.04.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-7052/3/artifact/out/branch-javadoc-hadoop-tools_hadoop-aws-jdkUbuntu-11.0.24+8-post-Ubuntu-1ubuntu320.04.txt)
 |  hadoop-aws in trunk failed with JDK 
Ubuntu-11.0.24+8-post-Ubuntu-1ubuntu320.04.  |
   | -1 :x: |  javadoc  |   0m 24s | 
[/branch-javadoc-hadoop-tools_hadoop-aws-jdkPrivateBuild-1.8.0_422-8u422-b05-1~20.04-b05.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-7052/3/artifact/out/branch-javadoc-hadoop-tools_hadoop-aws-jdkPrivateBuild-1.8.0_422-8u422-b05-1~20.04-b05.txt)
 |  hadoop-aws in trunk failed with JDK Private 
Build-1.8.0_422-8u422-b05-1~20.04-b05.  |
   | +1 :green_heart: |  spotbugs  |   2m 37s |  |  trunk passed  |
   | +1 :green_heart: |  shadedclient  |   3m 20s |  |  branch has no errors 
when building and testing our client artifacts.  |
    _ Patch Compile Tests _ |
   | -1 :x: |  mvninstall  |   0m 23s | 
[/patch-mvninstall-hadoop-tools_hadoop-aws.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-7052/3/artifact/out/patch-mvninstall-hadoop-tools_hadoop-aws.txt)
 |  hadoop-aws in the patch failed.  |
   | -1 :x: |  compile  |   0m 23s | 
[/patch-compile-hadoop-tools_hadoop-aws-jdkUbuntu-11.0.24+8-post-Ubuntu-1ubuntu320.04.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-7052/3/artifact/out/patch-compile-hadoop-tools_hadoop-aws-jdkUbuntu-11.0.24+8-post-Ubuntu-1ubuntu320.04.txt)
 |  hadoop-aws in the patch failed with JDK 
Ubuntu-11.0.24+8-post-Ubuntu-1ubuntu320.04.  |
   | -1 :x: |  javac  |   0m 23s | 
[/patch-compile-hadoop-tools_hadoop-aws-jdkUbuntu-11.0.24+8-post-Ubuntu-1ubuntu320.04.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-7052/3/artifact/out/patch-compile-hadoop-tools_hadoop-aws-jdkUbuntu-11.0.24+8-post-Ubuntu-1ubuntu320.04.txt)
 |  hadoop-aws in the patch failed with JDK 
Ubuntu-11.0.24+8-post-Ubuntu-1ubuntu320.04.  |
   | -1 :x: |  compile  |   0m 23s | 
[/patch-compile-hadoop-tools_hadoop-aws-jdkPrivateBuild-1.8.0_422-8u422-b05-1~20.04-b05.txt](https:

[jira] [Commented] (HADOOP-15984) Update jersey from 1.19 to 2.x

2024-10-09 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-15984?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17887986#comment-17887986
 ] 

ASF GitHub Bot commented on HADOOP-15984:
-

slfan1989 commented on code in PR #7019:
URL: https://github.com/apache/hadoop/pull/7019#discussion_r1793739270


##
hadoop-mapreduce-project/hadoop-mapreduce-client/pom.xml:
##
@@ -154,10 +154,21 @@
       <scope>provided</scope>
     </dependency>
     <dependency>
-      <groupId>com.sun.jersey.jersey-test-framework</groupId>
-      <artifactId>jersey-test-framework-grizzly2</artifactId>
+      <groupId>org.glassfish.jersey.test-framework</groupId>
+      <artifactId>jersey-test-framework-core</artifactId>
       <scope>test</scope>
     </dependency>
+    <dependency>
+      <groupId>org.glassfish.jersey.test-framework.providers</groupId>
+      <artifactId>jersey-test-framework-provider-grizzly2</artifactId>
+      <scope>test</scope>
+      

Review Comment:
   You’re right. I will make the necessary changes to ensure the POM is concise.





> Update jersey from 1.19 to 2.x
> --
>
> Key: HADOOP-15984
> URL: https://issues.apache.org/jira/browse/HADOOP-15984
> Project: Hadoop Common
>  Issue Type: Improvement
>Reporter: Akira Ajisaka
>Assignee: Shilun Fan
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 2h 10m
>  Remaining Estimate: 0h
>
> jersey-json 1.19 depends on Jackson 1.9.2. Let's upgrade.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-15984) Update jersey from 1.19 to 2.x

2024-10-09 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-15984?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17887982#comment-17887982
 ] 

ASF GitHub Bot commented on HADOOP-15984:
-

slfan1989 commented on code in PR #7019:
URL: https://github.com/apache/hadoop/pull/7019#discussion_r1793735243


##
hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/http/TestHttpServer.java:
##
@@ -448,17 +443,15 @@ public List<String> getGroups(String user) throws IOException {
 
     @Override
     public Set<String> getGroupsSet(String user) throws IOException {
-      Set<String> result = new HashSet<String>();
-      result.addAll(mapping.get(user));
-      return result;
+      return new HashSet<>(mapping.get(user));
     }
   }
   }
 
   /**
* Verify the access for /logs, /stacks, /conf, and /logLevel
* servlets, when authentication filters are set, but authorization is not
* enabled.
-   * @throws Exception 
+   * @throws Exception if there is an error during, an exception will be 
thrown.

Review Comment:
   I added this comment because Yetus warns us about JavaDoc compilation 
errors. While I included this description, I should pay more attention to the 
surrounding code context. I will improve it.





> Update jersey from 1.19 to 2.x
> --
>
> Key: HADOOP-15984
> URL: https://issues.apache.org/jira/browse/HADOOP-15984
> Project: Hadoop Common
>  Issue Type: Improvement
>Reporter: Akira Ajisaka
>Assignee: Shilun Fan
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 2h 10m
>  Remaining Estimate: 0h
>
> jersey-json 1.19 depends on Jackson 1.9.2. Let's upgrade.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-15984) Update jersey from 1.19 to 2.x

2024-10-09 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-15984?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17887981#comment-17887981
 ] 

ASF GitHub Bot commented on HADOOP-15984:
-

slfan1989 commented on code in PR #7019:
URL: https://github.com/apache/hadoop/pull/7019#discussion_r1793730438


##
hadoop-common-project/hadoop-common/pom.xml:
##
@@ -93,104 +93,69 @@
       <scope>compile</scope>
     </dependency>
     <dependency>
-      <groupId>javax.servlet</groupId>
-      <artifactId>javax.servlet-api</artifactId>
+      <groupId>jakarta.servlet.jsp</groupId>
+      <artifactId>jakarta.servlet.jsp-api</artifactId>
       <scope>compile</scope>
     </dependency>
     <dependency>
-      <groupId>jakarta.activation</groupId>
-      <artifactId>jakarta.activation-api</artifactId>
-      <scope>runtime</scope>
-    </dependency>
-    <dependency>
-      <groupId>org.eclipse.jetty</groupId>
-      <artifactId>jetty-server</artifactId>
+      <groupId>jakarta.ws.rs</groupId>
+      <artifactId>jakarta.ws.rs-api</artifactId>
       <scope>compile</scope>
     </dependency>
     <dependency>
-      <groupId>org.eclipse.jetty</groupId>
-      <artifactId>jetty-util</artifactId>
-      <scope>compile</scope>
-    </dependency>
-    <dependency>
-      <groupId>org.eclipse.jetty</groupId>
-      <artifactId>jetty-servlet</artifactId>
+      <groupId>org.glassfish.jersey.core</groupId>

Review Comment:
   Actually, I'm not sure if this is the most streamlined version. Sometimes, 
local compilation passes, but after submitting to Yetus, I still encounter 
"class not found" errors or issues with certain rules during unit tests. I 
added dependencies one by one based on the errors until they disappeared. I can 
try to make these dependencies a bit more streamlined.





> Update jersey from 1.19 to 2.x
> --
>
> Key: HADOOP-15984
>     URL: https://issues.apache.org/jira/browse/HADOOP-15984
> Project: Hadoop Common
>  Issue Type: Improvement
>Reporter: Akira Ajisaka
>Assignee: Shilun Fan
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 2h 10m
>  Remaining Estimate: 0h
>
> jersey-json 1.19 depends on Jackson 1.9.2. Let's upgrade.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-15984) Update jersey from 1.19 to 2.x

2024-10-09 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-15984?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17887977#comment-17887977
 ] 

ASF GitHub Bot commented on HADOOP-15984:
-

slfan1989 commented on PR #7019:
URL: https://github.com/apache/hadoop/pull/7019#issuecomment-2402636984

   > > the overall renovation time has taken longer than expected
   > 
   > I can't think of any big change where things took less than expected.
   > 
   > I'll have a quick look at it, but it's big enough I'll only be trying to 
spot the obvious issues i've hit before. after that it's well "merge and see"
   
   I initially thought I could complete this PR more quickly since the Yarn 
modules are similar, and modifying one would make the others relatively easier, 
until I encountered the timeline module (somewhat different from the previous 
ones). 




> Update jersey from 1.19 to 2.x
> --
>
>     Key: HADOOP-15984
> URL: https://issues.apache.org/jira/browse/HADOOP-15984
> Project: Hadoop Common
>  Issue Type: Improvement
>Reporter: Akira Ajisaka
>Assignee: Shilun Fan
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 2h 10m
>  Remaining Estimate: 0h
>
> jersey-json 1.19 depends on Jackson 1.9.2. Let's upgrade.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-15984) Update jersey from 1.19 to 2.x

2024-10-09 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-15984?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17887974#comment-17887974
 ] 

ASF GitHub Bot commented on HADOOP-15984:
-

slfan1989 commented on PR #7019:
URL: https://github.com/apache/hadoop/pull/7019#issuecomment-2402623880

   @steveloughran 
   
   Thank you very much for reviewing this PR! Based on the recent compilation 
results, we have largely resolved the unit test errors, spotbugs warnings, and 
compilation issues (the latest issue seems unrelated to our changes). I 
appreciate your suggestions! I'm preparing a new version that will incorporate 
your feedback, continue to streamline some code, update the JAR package 
versions, and document these changes in LICENSE-binary. I am optimistic about 
the impact of these changes on code stability, as most improvements focus on 
unit tests with minimal modifications to business logic. I hope we can smoothly 
complete this PR and merge it into both trunk and branch-3.4.




> Update jersey from 1.19 to 2.x
> --
>
> Key: HADOOP-15984
> URL: https://issues.apache.org/jira/browse/HADOOP-15984
> Project: Hadoop Common
>  Issue Type: Improvement
>Reporter: Akira Ajisaka
>Assignee: Shilun Fan
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 2h 10m
>  Remaining Estimate: 0h
>
> jersey-json 1.19 depends on Jackson 1.9.2. Let's upgrade.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-19291) `CombinedFileRange.merge` should not convert disjoint ranges into overlapped ones

2024-10-09 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-19291?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17887941#comment-17887941
 ] 

ASF GitHub Bot commented on HADOOP-19291:
-

mukund-thakur merged PR #7101:
URL: https://github.com/apache/hadoop/pull/7101




> `CombinedFileRange.merge` should not convert disjoint ranges into overlapped 
> ones
> -
>
> Key: HADOOP-19291
> URL: https://issues.apache.org/jira/browse/HADOOP-19291
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: common
>Affects Versions: 3.3.9, 3.5.0, 3.4.1
>Reporter: Dongjoon Hyun
>Priority: Major
>  Labels: pull-request-available
> Attachments: Screenshot 2024-09-28 at 22.02.01.png
>
>
> Currently, Hadoop has a bug that converts disjoint ranges into overlapped 
> ones and eventually fails by itself.
> {code:java}
> +  public void testMergeSortedRanges() {
> +    List<FileRange> input = asList(
> +        createFileRange(13816220, 24, null),
> +        createFileRange(13816244, 7423960, null)
> +    );
> +    assertIsNotOrderedDisjoint(input, 100, 800);
> +    final List<CombinedFileRange> outputList = mergeSortedRanges(
> +        sortRangeList(input), 100, 1001, 2500);
> +
> +    assertRangeListSize(outputList, 1);
> +    assertFileRange(outputList.get(0), 13816200, 7424100);
> +  }
> {code}
>  !Screenshot 2024-09-28 at 22.02.01.png! 



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-19256) S3A: Support S3 Conditional Writes

2024-10-09 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-19256?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17887935#comment-17887935
 ] 

ASF GitHub Bot commented on HADOOP-19256:
-

steveloughran commented on code in PR #7011:
URL: https://github.com/apache/hadoop/pull/7011#discussion_r1793509314


##
hadoop-tools/hadoop-aws/src/test/java/org/apache/hadoop/fs/s3a/impl/ITestS3APutIfMatch.java:
##
@@ -0,0 +1,142 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.hadoop.fs.s3a.impl;
+
+import org.apache.hadoop.conf.Configuration;
+import org.apache.hadoop.fs.FSDataOutputStream;
+import org.apache.hadoop.fs.FSDataOutputStreamBuilder;
+import org.apache.hadoop.fs.FileSystem;
+import org.apache.hadoop.fs.Path;
+import org.apache.hadoop.fs.s3a.performance.AbstractS3ACostTest;
+import org.apache.hadoop.fs.s3a.RemoteFileChangedException;
+import org.apache.hadoop.fs.s3a.S3ATestUtils;
+import org.apache.hadoop.io.IOUtils;
+
+import org.assertj.core.api.Assertions;
+import org.junit.Assert;
+import org.junit.Test;
+import software.amazon.awssdk.services.s3.model.S3Exception;
+
+import java.io.IOException;
+import static org.apache.hadoop.fs.contract.ContractTestUtils.dataset;
+import static org.apache.hadoop.fs.s3a.Constants.FAST_UPLOAD_BUFFER;
+import static org.apache.hadoop.fs.s3a.Constants.FAST_UPLOAD_BUFFER_ARRAY;
+import static org.apache.hadoop.fs.s3a.Constants.FS_S3A_CONDITIONAL_FILE_CREATE;
+import static org.apache.hadoop.fs.s3a.Constants.MIN_MULTIPART_THRESHOLD;
+import static org.apache.hadoop.fs.s3a.Constants.MULTIPART_MIN_SIZE;
+import static org.apache.hadoop.fs.s3a.Constants.MULTIPART_SIZE;
+import static org.apache.hadoop.fs.s3a.S3ATestUtils.skipIfNotEnabled;
+import static org.apache.hadoop.fs.s3a.S3ATestUtils.removeBaseAndBucketOverrides;
+import static org.apache.hadoop.fs.s3a.impl.InternalConstants.UPLOAD_PART_COUNT_LIMIT;
+import static org.apache.hadoop.fs.s3a.scale.ITestS3AMultipartUploadSizeLimits.MPU_SIZE;
+import static org.apache.hadoop.fs.s3a.scale.S3AScaleTestBase._1MB;
+
+
+public class ITestS3APutIfMatch extends AbstractS3ACostTest {
+
+    private Configuration conf;
+
+    @Override
+    public Configuration createConfiguration() {
+        Configuration conf = super.createConfiguration();
+        S3ATestUtils.disableFilesystemCaching(conf);
+        removeBaseAndBucketOverrides(conf,
+                MULTIPART_SIZE,
+                UPLOAD_PART_COUNT_LIMIT);
+        conf.setLong(MULTIPART_SIZE, MPU_SIZE);
+        conf.setLong(UPLOAD_PART_COUNT_LIMIT, 2);
+        conf.setLong(MIN_MULTIPART_THRESHOLD, MULTIPART_MIN_SIZE);
+        conf.setInt(MULTIPART_SIZE, MULTIPART_MIN_SIZE);
+        conf.set(FAST_UPLOAD_BUFFER, getBlockOutputBufferName());

Review Comment:
   Just leave this alone, unless you want to do parameterized runs



##
hadoop-tools/hadoop-aws/src/test/java/org/apache/hadoop/fs/s3a/impl/ITestS3APutIfMatch.java:
##
@@ -0,0 +1,142 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.hadoop.fs.s3a.impl;
+
+import org.apache.hadoop.conf.Configuration;
+import org.apache.hadoop.fs.FSDataOutputStream;
+import org.apache.hadoop.fs.FSDataOutputStreamBuilder;
+import org.apache.hadoop.fs.FileSystem;
+import org.apache.hadoop.fs.Path;
+import org.apache.hadoop.fs.

[jira] [Commented] (HADOOP-19258) Update AWS Java SDK to 2.27.14

2024-10-09 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-19258?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17887931#comment-17887931
 ] 

ASF GitHub Bot commented on HADOOP-19258:
-

steveloughran commented on PR #7015:
URL: https://github.com/apache/hadoop/pull/7015#issuecomment-2402316731

   @diljotgrewal I've created a feature branch for this work: 
HADOOP-19256-s3-conditional-writes
   
   Can you upgrade to the latest SDK, do the usual qualification work, and then 
submit a request against this branch? I don't want to do any SDK updates to 
trunk itself until I can come up with an even more robust qualification, which 
I think will have to include auditing bits of the SDK as well to see if there 
are new warnings added. 




>  Update AWS Java SDK to 2.27.14
> ---
>
> Key: HADOOP-19258
> URL: https://issues.apache.org/jira/browse/HADOOP-19258
> Project: Hadoop Common
>  Issue Type: Task
>  Components: build, fs/s3
>Reporter: Diljot Grewal
>Priority: Major
>  Labels: pull-request-available
>
> Upgrade SDK to add IfNoneMatch support for upcoming PutIfNotExist 
> functionality
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-19256) S3A: Support S3 Conditional Writes

2024-10-09 Thread Steve Loughran (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-19256?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17887928#comment-17887928
 ] 

Steve Loughran commented on HADOOP-19256:
-

I've just stuck up a document on how I think we could go about adopting this.
I will start by creating that new branch for the work to go with the existing 
PR targeted for Hadoop 3.4.2

> S3A: Support S3 Conditional Writes
> --
>
> Key: HADOOP-19256
> URL: https://issues.apache.org/jira/browse/HADOOP-19256
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Reporter: Ahmar Suhail
>Priority: Major
>  Labels: pull-request-available
>
> S3 Conditional Write (Put-if-absent) capability is now generally available - 
> [https://aws.amazon.com/about-aws/whats-new/2024/08/amazon-s3-conditional-writes/]
>  
> S3A should allow passing in this put-if-absent header to prevent overwriting 
> of files. 
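
For illustration, put-if-absent hinges on the HTTP {{If-None-Match: *}} 
precondition. A hedged sketch at the SDK level, assuming an AWS SDK v2 version 
whose {{PutObjectRequest.Builder}} exposes {{ifNoneMatch}} (the upgrade tracked 
in HADOOP-19258 above); bucket, key and payload are placeholders:

{code:java}
import software.amazon.awssdk.core.sync.RequestBody;
import software.amazon.awssdk.services.s3.S3Client;
import software.amazon.awssdk.services.s3.model.PutObjectRequest;
import software.amazon.awssdk.services.s3.model.S3Exception;

public class PutIfAbsentSketch {
  public static void main(String[] args) {
    try (S3Client s3 = S3Client.create()) {
      PutObjectRequest request = PutObjectRequest.builder()
          .bucket("example-bucket")
          .key("example-key")
          .ifNoneMatch("*")        // only succeed if the object is absent
          .build();
      s3.putObject(request, RequestBody.fromString("data"));
    } catch (S3Exception e) {
      if (e.statusCode() == 412) {
        // 412 Precondition Failed: the object already exists.
        System.out.println("object already exists; not overwritten");
      } else {
        throw e;
      }
    }
  }
}
{code}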



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-15760) Upgrade commons-collections to commons-collections4

2024-10-09 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-15760?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17887924#comment-17887924
 ] 

ASF GitHub Bot commented on HADOOP-15760:
-

steveloughran commented on PR #7097:
URL: https://github.com/apache/hadoop/pull/7097#issuecomment-2402249057

   merged #7017, so this patch gets easier




> Upgrade commons-collections to commons-collections4
> ---
>
> Key: HADOOP-15760
> URL: https://issues.apache.org/jira/browse/HADOOP-15760
> Project: Hadoop Common
>  Issue Type: Improvement
>Affects Versions: 2.10.0, 3.0.3
>Reporter: David Mollitor
>Assignee: Nihal Jain
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.5.0
>
> Attachments: HADOOP-15760.1.patch
>
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> Please allow for use of Apache Commons Collections 4 library with the end 
> goal of migrating from Apache Commons Collections 3.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-19107) Drop support for HBase v1 timeline service & upgrade HBase v2

2024-10-09 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-19107?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17887923#comment-17887923
 ] 

ASF GitHub Bot commented on HADOOP-19107:
-

steveloughran merged PR #7017:
URL: https://github.com/apache/hadoop/pull/7017




> Drop support for HBase v1 timeline service & upgrade HBase v2
> -
>
> Key: HADOOP-19107
> URL: https://issues.apache.org/jira/browse/HADOOP-19107
> Project: Hadoop Common
>  Issue Type: Task
>Affects Versions: 3.4.0, 3.5.0
>Reporter: Ayush Saxena
>Assignee: Ayush Saxena
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.5.0
>
>
> Drop support for HBase v1 as the back end of the YARN Application Timeline 
> service, which becomes HBase 2 only.
> This does not have any effect on HBase deployments themselves.
> Dev List:
> [https://lists.apache.org/thread/vb2gh5ljwncbrmqnk0oflb8ftdz64hhs]
> https://lists.apache.org/thread/o88hnm7q8n3b4bng81q14vsj3fbhfx5w



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Updated] (HADOOP-19107) Drop support for HBase v1 timeline service & upgrade HBase v2

2024-10-09 Thread Steve Loughran (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-19107?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Loughran updated HADOOP-19107:

Fix Version/s: 3.4.2

> Drop support for HBase v1 timeline service & upgrade HBase v2
> -
>
> Key: HADOOP-19107
> URL: https://issues.apache.org/jira/browse/HADOOP-19107
> Project: Hadoop Common
>  Issue Type: Task
>Affects Versions: 3.4.0, 3.5.0
>Reporter: Ayush Saxena
>Assignee: Ayush Saxena
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.5.0, 3.4.2
>
>
> Drop support for HBase v1 as the back end of the YARN Application Timeline 
> service, which becomes HBase 2 only.
> This does not have any effect on HBase deployments themselves.
> Dev List:
> [https://lists.apache.org/thread/vb2gh5ljwncbrmqnk0oflb8ftdz64hhs]
> https://lists.apache.org/thread/o88hnm7q8n3b4bng81q14vsj3fbhfx5w



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-19107) Drop support for HBase v1 timeline service & upgrade HBase v2

2024-10-09 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-19107?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17887921#comment-17887921
 ] 

ASF GitHub Bot commented on HADOOP-19107:
-

steveloughran commented on PR #7017:
URL: https://github.com/apache/hadoop/pull/7017#issuecomment-2402241059

   OK, going ahead with this. We need it for ongoing work.




> Drop support for HBase v1 timeline service & upgrade HBase v2
> -
>
> Key: HADOOP-19107
> URL: https://issues.apache.org/jira/browse/HADOOP-19107
> Project: Hadoop Common
>  Issue Type: Task
>Affects Versions: 3.4.0, 3.5.0
>Reporter: Ayush Saxena
>Assignee: Ayush Saxena
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.5.0
>
>
> Drop support for HBase v1 as the back end of the YARN Application Timeline 
> service, which becomes HBase 2 only.
> This does not have any effect on HBase deployments themselves.
> Dev List:
> [https://lists.apache.org/thread/vb2gh5ljwncbrmqnk0oflb8ftdz64hhs]
> https://lists.apache.org/thread/o88hnm7q8n3b4bng81q14vsj3fbhfx5w



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Resolved] (HADOOP-19295) S3A: fs.s3a.connection.request.timeout too low for large uploads over slow links

2024-10-09 Thread Steve Loughran (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-19295?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Loughran resolved HADOOP-19295.
-
Fix Version/s: 3.5.0
   3.4.1
   Resolution: Fixed

> S3A: fs.s3a.connection.request.timeout too low for large uploads over slow 
> links
> 
>
> Key: HADOOP-19295
> URL: https://issues.apache.org/jira/browse/HADOOP-19295
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Affects Versions: 3.4.0, 3.4.1
>Reporter: Steve Loughran
>Assignee: Steve Loughran
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.5.0, 3.4.1
>
>
> The value of {{fs.s3a.connection.request.timeout}} (default = 60s) is too low 
> for large uploads over slow connections.
> I suspect something changed between the v1 and v2 SDK versions so that put 
> used to be exempt from the normal timeouts. It no longer is, and this now 
> surfaces in failures to upload 1+ GB files over slower network connections. 
> Smaller (for example 128 MB) files work.
> The parallel queuing of writes in the S3ABlockOutputStream helps create this 
> problem, as it queues multiple blocks at the same time: per-block bandwidth 
> becomes (available bandwidth)/(number of blocks), so four blocks cut the 
> per-block capacity to a quarter.
> The fix is straightforward: use a much bigger timeout. I'm going to propose 
> 15 minutes. We need to strike a balance between upload time allocation and 
> other requests timing out.
> I do worry about other consequences; we've found that timeout exceptions tend 
> to hide the underlying causes of retry failures, so in fact this may be 
> better for everything except a server hanging after the HTTP request is 
> initiated.
> Too bad we can't alter the timeout for different requests.
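
A hedged sketch of the proposed mitigation in core-site.xml; the 15-minute 
value is the proposal above, not a released default, and assumes a Hadoop 
build that accepts time-unit suffixes for this option (otherwise use a 
millisecond value):

{code:xml}
<!-- Raise the S3A per-request timeout for big uploads over slow links. -->
<property>
  <name>fs.s3a.connection.request.timeout</name>
  <value>15m</value>
</property>
{code}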



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-19291) `CombinedFileRange.merge` should not convert disjoint ranges into overlapped ones

2024-10-09 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-19291?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17887919#comment-17887919
 ] 

ASF GitHub Bot commented on HADOOP-19291:
-

steveloughran commented on PR #7079:
URL: https://github.com/apache/hadoop/pull/7079#issuecomment-2402228009

   +1 to cutting the validation in raw local; the way the nio reads are 
executed, it is safe. 
   
   I should note that it only surfaces with file:// ; ChecksumFS + RawLocal, 
with a large enough file that multiple large ranges are supplied close 
to/adjacent to each other.
   
   We missed this because we were only doing small files in the unit tests and 
testing against HDFS and S3 in the full stack tests. "Our tests were too good".




> `CombinedFileRange.merge` should not convert disjoint ranges into overlapped 
> ones
> -
>
> Key: HADOOP-19291
> URL: https://issues.apache.org/jira/browse/HADOOP-19291
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: common
>Affects Versions: 3.3.9, 3.5.0, 3.4.1
>Reporter: Dongjoon Hyun
>Priority: Major
>  Labels: pull-request-available
> Attachments: Screenshot 2024-09-28 at 22.02.01.png
>
>
> Currently, Hadoop has a bug that converts disjoint ranges into overlapped 
> ones and eventually fails by itself.
> {code:java}
> +  public void testMergeSortedRanges() {
> +    List<FileRange> input = asList(
> +        createFileRange(13816220, 24, null),
> +        createFileRange(13816244, 7423960, null)
> +    );
> +    assertIsNotOrderedDisjoint(input, 100, 800);
> +    final List<CombinedFileRange> outputList = mergeSortedRanges(
> +        sortRangeList(input), 100, 1001, 2500);
> +
> +    assertRangeListSize(outputList, 1);
> +    assertFileRange(outputList.get(0), 13816200, 7424100);
> +  }
> {code}
>  !Screenshot 2024-09-28 at 22.02.01.png! 



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-19229) Vector IO on cloud storage: what is a good minimum seek size?

2024-10-09 Thread Steve Loughran (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-19229?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17887912#comment-17887912
 ] 

Steve Loughran commented on HADOOP-19229:
-

facebook velox paper on merging time: 
https://research.facebook.com/publications/velox-metas-unified-execution-engine/

-

cached columns are first read from disaggregated storage systems, such as S3 or 
HDFS, stored in RAM for the time of first use, and eventually persisted to 
local SSD. Furthermore, IO reads for nearby columns are typically coalesced 
(merged) if the gap between them is small enough (currently about 20K for SSD 
and 500K for disaggregated storage), aiming to serve neighboring reads in as 
few IO reads as possible. Naturally, this leverages the effect of temporal 
locality which makes correlated columns to be cached together on SSD.

Considering that all remote columnar formats have similar access patterns, 
consisting of first reading file metadata to identify the buffer boundaries, 
followed by read of parts of these buffers, IO reads can be scheduled in 
advance (prefetched) in order to interleave IO stalls and CPU processing. Velox 
tracks access frequencies of columns on a per-query basis, and adaptively 
schedules prefetches for hot columns. The combination of memory caching and 
smart pre-fetching logic makes many SQL interactive analytical workloads, which 
are commonly built based on small to mid-sized tables, to be effectively served 
from memory, since IO stalls are taken off of the critical path and do not 
contribute to query latency.

-


> Vector IO on cloud storage: what is a good minimum seek size?
> -
>
> Key: HADOOP-19229
> URL: https://issues.apache.org/jira/browse/HADOOP-19229
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Affects Versions: 3.4.1
>Reporter: Steve Loughran
>Priority: Major
>
> vector IO has a max size to coalesce ranges, but it also needs a maximum gap 
> between ranges to justify the merge. Right now we could have a read where two 
> vectors of size 8 bytes can be merged with a 1 MB gap between them, and 
> that's wasteful. 
> We could also consider an "efficiency" metric which looks at the ratio of 
> bytes-read to bytes-discarded. Not sure what we'd do with it, but we could 
> track it as an IOStat.
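
A hedged sketch of the gap-aware merge check being suggested; the names and 
signatures are illustrative, not the Hadoop vectored-IO API:

{code:java}
final class MergeSketch {

  // Merge two sorted, disjoint ranges only when the gap between them is
  // small enough to justify reading (and discarding) the bytes in between.
  static boolean shouldMerge(long offsetA, int lengthA,
                             long offsetB, int lengthB,
                             int maxGap, int maxMergedSize) {
    long gap = offsetB - (offsetA + lengthA);   // bytes discarded if merged
    long merged = offsetB + lengthB - offsetA;  // bytes read if merged
    return gap >= 0 && gap <= maxGap && merged <= maxMergedSize;
  }

  public static void main(String[] args) {
    // Two 8-byte reads 1 MB apart: not worth merging (prints false).
    System.out.println(shouldMerge(0, 8, 1_048_584, 8, 20_480, 134_217_728));
    // The same reads 100 bytes apart: merge them (prints true).
    System.out.println(shouldMerge(0, 8, 108, 8, 20_480, 134_217_728));
  }
}
{code}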



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-18960) ABFS contract-tests with Hadoop-Commons intermittently failing

2024-10-09 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-18960?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17887910#comment-17887910
 ] 

ASF GitHub Bot commented on HADOOP-18960:
-

hadoop-yetus commented on PR #7104:
URL: https://github.com/apache/hadoop/pull/7104#issuecomment-2402170096

   :confetti_ball: **+1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |::|--:|:|::|:---:|
   | +0 :ok: |  reexec  |   0m 20s |  |  Docker mode activated.  |
    _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  0s |  |  No case conflicting files 
found.  |
   | +0 :ok: |  codespell  |   0m  1s |  |  codespell was not available.  |
   | +0 :ok: |  detsecrets  |   0m  1s |  |  detect-secrets was not available.  
|
   | +0 :ok: |  xmllint  |   0m  1s |  |  xmllint was not available.  |
   | +1 :green_heart: |  @author  |   0m  0s |  |  The patch does not contain 
any @author tags.  |
   | +1 :green_heart: |  test4tests  |   0m  0s |  |  The patch appears to 
include 7 new or modified test files.  |
    _ trunk Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |  31m 17s |  |  trunk passed  |
   | +1 :green_heart: |  compile  |   0m 25s |  |  trunk passed with JDK 
Ubuntu-11.0.24+8-post-Ubuntu-1ubuntu320.04  |
   | +1 :green_heart: |  compile  |   0m 22s |  |  trunk passed with JDK 
Private Build-1.8.0_422-8u422-b05-1~20.04-b05  |
   | +1 :green_heart: |  checkstyle  |   0m 22s |  |  trunk passed  |
   | +1 :green_heart: |  mvnsite  |   0m 29s |  |  trunk passed  |
   | +1 :green_heart: |  javadoc  |   0m 27s |  |  trunk passed with JDK 
Ubuntu-11.0.24+8-post-Ubuntu-1ubuntu320.04  |
   | +1 :green_heart: |  javadoc  |   0m 22s |  |  trunk passed with JDK 
Private Build-1.8.0_422-8u422-b05-1~20.04-b05  |
   | +1 :green_heart: |  spotbugs  |   0m 45s |  |  trunk passed  |
   | +1 :green_heart: |  shadedclient  |  20m  4s |  |  branch has no errors 
when building and testing our client artifacts.  |
    _ Patch Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |   0m 20s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   0m 20s |  |  the patch passed with JDK 
Ubuntu-11.0.24+8-post-Ubuntu-1ubuntu320.04  |
   | +1 :green_heart: |  javac  |   0m 20s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   0m 18s |  |  the patch passed with JDK 
Private Build-1.8.0_422-8u422-b05-1~20.04-b05  |
   | +1 :green_heart: |  javac  |   0m 18s |  |  the patch passed  |
   | +1 :green_heart: |  blanks  |   0m  0s |  |  The patch has no blanks 
issues.  |
   | +1 :green_heart: |  checkstyle  |   0m 11s |  |  the patch passed  |
   | +1 :green_heart: |  mvnsite  |   0m 21s |  |  the patch passed  |
   | +1 :green_heart: |  javadoc  |   0m 18s |  |  the patch passed with JDK 
Ubuntu-11.0.24+8-post-Ubuntu-1ubuntu320.04  |
   | +1 :green_heart: |  javadoc  |   0m 18s |  |  the patch passed with JDK 
Private Build-1.8.0_422-8u422-b05-1~20.04-b05  |
   | +1 :green_heart: |  spotbugs  |   0m 42s |  |  the patch passed  |
   | +1 :green_heart: |  shadedclient  |  19m 46s |  |  patch has no errors 
when building and testing our client artifacts.  |
    _ Other Tests _ |
   | +1 :green_heart: |  unit  |   2m  0s |  |  hadoop-azure in the patch 
passed.  |
   | +1 :green_heart: |  asflicense  |   0m 25s |  |  The patch does not 
generate ASF License warnings.  |
   |  |   |  80m 39s |  |  |
   
   
   | Subsystem | Report/Notes |
   |--:|:-|
   | Docker | ClientAPI=1.47 ServerAPI=1.47 base: 
https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-7104/3/artifact/out/Dockerfile
 |
   | GITHUB PR | https://github.com/apache/hadoop/pull/7104 |
   | Optional Tests | dupname asflicense compile javac javadoc mvninstall 
mvnsite unit shadedclient codespell detsecrets xmllint spotbugs checkstyle |
   | uname | Linux 0db572a1f402 5.15.0-117-generic #127-Ubuntu SMP Fri Jul 5 
20:13:28 UTC 2024 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | dev-support/bin/hadoop.sh |
   | git revision | trunk / d7552552a8015e8c3c65fa4f46e068d824d7a7e2 |
   | Default Java | Private Build-1.8.0_422-8u422-b05-1~20.04-b05 |
   | Multi-JDK versions | 
/usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.24+8-post-Ubuntu-1ubuntu320.04 
/usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_422-8u422-b05-1~20.04-b05 
|
   |  Test Results | 
https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-7104/3/testReport/ |
   | Max. process+thread count | 552 (vs. ulimit of 5500) |
   | modules | C: hadoop-tools/hadoop-azure U: hadoop-tools/hadoop-azure |
   | Console output | 
https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-7104/3/console |
   | versions | git=2.25.1 maven=3.6.3 spotbugs=4.2.2 |
   | Powered by | Apache Yetus 0.14.0 https://yetus.apache.org |
   
   
   This message was automatically generated.

[jira] [Commented] (HADOOP-19106) [ABFS] All tests of ITestAzureBlobFileSystemAuthorization fail with NPE

2024-10-09 Thread Anuj Modi (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-19106?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17887902#comment-17887902
 ] 

Anuj Modi commented on HADOOP-19106:


Ran a test suite with the configs that you have [~mthakur] 

Here are the results: Metric related failures are known and fixed in 
https://github.com/apache/hadoop/pull/6847

--
 AGGREGATED TEST RESULT 


HNS-SharedKey


[ERROR] 
testBackoffRetryMetrics(org.apache.hadoop.fs.azurebfs.services.TestAbfsRestOperation)
  Time elapsed: 3.822 s  <<< ERROR!
[ERROR] 
testReadFooterMetrics(org.apache.hadoop.fs.azurebfs.ITestAbfsReadFooterMetrics) 
 Time elapsed: 1.135 s  <<< ERROR!
[ERROR] 
testMetricWithIdlePeriod(org.apache.hadoop.fs.azurebfs.ITestAbfsReadFooterMetrics)
  Time elapsed: 1.187 s  <<< ERROR!
[ERROR] 
testReadFooterMetricsWithParquetAndNonParquet(org.apache.hadoop.fs.azurebfs.ITestAbfsReadFooterMetrics)
  Time elapsed: 1.174 s  <<< ERROR!

[ERROR] Tests run: 157, Failures: 0, Errors: 1, Skipped: 2
[ERROR] Tests run: 652, Failures: 0, Errors: 3, Skipped: 98
[WARNING] Tests run: 171, Failures: 0, Errors: 0, Skipped: 25
[WARNING] Tests run: 262, Failures: 0, Errors: 0, Skipped: 10

> [ABFS] All tests of ITestAzureBlobFileSystemAuthorization fail with NPE
> -
>
> Key: HADOOP-19106
> URL: https://issues.apache.org/jira/browse/HADOOP-19106
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: fs/azure
>Affects Versions: 3.4.0
>Reporter: Mukund Thakur
>Assignee: Anuj Modi
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.4.1
>
>
> When the below config is set to true, all of the tests fail; otherwise they 
> are skipped.
> <property>
>     <name>fs.azure.test.namespace.enabled</name>
>     <value>true</value>
> </property>
>  
> [*ERROR*] 
> testOpenFileAuthorized(org.apache.hadoop.fs.azurebfs.ITestAzureBlobFileSystemAuthorization)
>   Time elapsed: 0.064 s  <<< ERROR!
> java.lang.NullPointerException
>  at 
> org.apache.hadoop.fs.azurebfs.ITestAzureBlobFileSystemAuthorization.runTest(ITestAzureBlobFileSystemAuthorization.java:273)
>  at 
> org.apache.hadoop.fs.azurebfs.ITestAzureBlobFileSystemAuthorization.testOpenFileAuthorized(ITestAzureBlobFileSystemAuthorization.java:132)
>  at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>  at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>  at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>  at java.lang.reflect.Method.invoke(Method.java:498)
>  at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59)
>  at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>  at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56)
>  at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>  at 
> org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
>  at 
> org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-18960) ABFS contract-tests with Hadoop-Commons intermittently failing

2024-10-09 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-18960?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17887891#comment-17887891
 ] 

ASF GitHub Bot commented on HADOOP-18960:
-

hadoop-yetus commented on PR #7104:
URL: https://github.com/apache/hadoop/pull/7104#issuecomment-2402083104

   :confetti_ball: **+1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |::|--:|:|::|:---:|
   | +0 :ok: |  reexec  |   0m 48s |  |  Docker mode activated.  |
    _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  0s |  |  No case conflicting files 
found.  |
   | +0 :ok: |  codespell  |   0m  0s |  |  codespell was not available.  |
   | +0 :ok: |  detsecrets  |   0m  0s |  |  detect-secrets was not available.  
|
   | +0 :ok: |  xmllint  |   0m  0s |  |  xmllint was not available.  |
   | +1 :green_heart: |  @author  |   0m  0s |  |  The patch does not contain 
any @author tags.  |
   | +1 :green_heart: |  test4tests  |   0m  0s |  |  The patch appears to 
include 6 new or modified test files.  |
    _ trunk Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |  48m 14s |  |  trunk passed  |
   | +1 :green_heart: |  compile  |   0m 41s |  |  trunk passed with JDK 
Ubuntu-11.0.24+8-post-Ubuntu-1ubuntu320.04  |
   | +1 :green_heart: |  compile  |   0m 37s |  |  trunk passed with JDK 
Private Build-1.8.0_422-8u422-b05-1~20.04-b05  |
   | +1 :green_heart: |  checkstyle  |   0m 31s |  |  trunk passed  |
   | +1 :green_heart: |  mvnsite  |   0m 42s |  |  trunk passed  |
   | +1 :green_heart: |  javadoc  |   0m 41s |  |  trunk passed with JDK 
Ubuntu-11.0.24+8-post-Ubuntu-1ubuntu320.04  |
   | +1 :green_heart: |  javadoc  |   0m 34s |  |  trunk passed with JDK 
Private Build-1.8.0_422-8u422-b05-1~20.04-b05  |
   | +1 :green_heart: |  spotbugs  |   1m  8s |  |  trunk passed  |
   | +1 :green_heart: |  shadedclient  |  41m 51s |  |  branch has no errors 
when building and testing our client artifacts.  |
    _ Patch Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |   0m 35s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   0m 35s |  |  the patch passed with JDK 
Ubuntu-11.0.24+8-post-Ubuntu-1ubuntu320.04  |
   | +1 :green_heart: |  javac  |   0m 35s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   0m 28s |  |  the patch passed with JDK 
Private Build-1.8.0_422-8u422-b05-1~20.04-b05  |
   | +1 :green_heart: |  javac  |   0m 28s |  |  the patch passed  |
   | +1 :green_heart: |  blanks  |   0m  0s |  |  The patch has no blanks 
issues.  |
   | -0 :warning: |  checkstyle  |   0m 19s | 
[/results-checkstyle-hadoop-tools_hadoop-azure.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-7104/1/artifact/out/results-checkstyle-hadoop-tools_hadoop-azure.txt)
 |  hadoop-tools/hadoop-azure: The patch generated 5 new + 11 unchanged - 0 
fixed = 16 total (was 11)  |
   | +1 :green_heart: |  mvnsite  |   0m 32s |  |  the patch passed  |
   | +1 :green_heart: |  javadoc  |   0m 30s |  |  the patch passed with JDK 
Ubuntu-11.0.24+8-post-Ubuntu-1ubuntu320.04  |
   | +1 :green_heart: |  javadoc  |   0m 27s |  |  the patch passed with JDK 
Private Build-1.8.0_422-8u422-b05-1~20.04-b05  |
   | +1 :green_heart: |  spotbugs  |   1m 11s |  |  the patch passed  |
   | +1 :green_heart: |  shadedclient  |  39m 59s |  |  patch has no errors 
when building and testing our client artifacts.  |
    _ Other Tests _ |
   | +1 :green_heart: |  unit  |   2m 37s |  |  hadoop-azure in the patch 
passed.  |
   | +1 :green_heart: |  asflicense  |   0m 38s |  |  The patch does not 
generate ASF License warnings.  |
   |  |   | 144m 46s |  |  |
   
   
   | Subsystem | Report/Notes |
   |--:|:-|
   | Docker | ClientAPI=1.47 ServerAPI=1.47 base: 
https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-7104/1/artifact/out/Dockerfile
 |
   | GITHUB PR | https://github.com/apache/hadoop/pull/7104 |
   | Optional Tests | dupname asflicense compile javac javadoc mvninstall 
mvnsite unit shadedclient codespell detsecrets xmllint spotbugs checkstyle |
   | uname | Linux 70f861dc583a 5.15.0-119-generic #129-Ubuntu SMP Fri Aug 2 
19:25:20 UTC 2024 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | dev-support/bin/hadoop.sh |
   | git revision | trunk / acca3fb677398e971d66abbcfc3749f0b3edde27 |
   | Default Java | Private Build-1.8.0_422-8u422-b05-1~20.04-b05 |
   | Multi-JDK versions | 
/usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.24+8-post-Ubuntu-1ubuntu320.04 
/usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_422-8u422-b05-1~20.04-b05 
|
   |  Test Results | 
https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-7104/1/testReport/ |
   | Max. process+thread count | 527 (vs. ulimit of 5500) |
   | modules | C: hadoop-tools/hadoop-azure U: hadoop-tools/hadoop-azure |
   | Console ou

[jira] [Updated] (HADOOP-19229) Vector IO on cloud storage: what is a good minimum seek size?

2024-10-09 Thread Steve Loughran (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-19229?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Loughran updated HADOOP-19229:

Release Note:   (was: This is exactly what 
`fs.s3a.vectored.read.min.seek.size` does. We set it to 4K; maybe we should 
review it. The Facebook Velox paper says that 20kB is better for cloud storage.)

> Vector IO on cloud storage: what is a good minimum seek size?
> -
>
> Key: HADOOP-19229
> URL: https://issues.apache.org/jira/browse/HADOOP-19229
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Affects Versions: 3.4.1
>Reporter: Steve Loughran
>Priority: Major
>
> vector IO has a max size to coalesce ranges, but it also needs a maximum gap 
> between ranges to justify the merge. Right now we could have a read where two 
> vectors of size 8 bytes can be merged with a 1 MB gap between them, and 
> that's wasteful. 
> We could also consider an "efficiency" metric which looks at the ratio of 
> bytes-read to bytes-discarded. Not sure what we'd do with it, but we could 
> track it as an IOStat.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-18960) ABFS contract-tests with Hadoop-Commons intermittently failing

2024-10-09 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-18960?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17887880#comment-17887880
 ] 

ASF GitHub Bot commented on HADOOP-18960:
-

hadoop-yetus commented on PR #7104:
URL: https://github.com/apache/hadoop/pull/7104#issuecomment-2401963806

   :confetti_ball: **+1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |::|--:|:|::|:---:|
   | +0 :ok: |  reexec  |   7m 10s |  |  Docker mode activated.  |
    _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  0s |  |  No case conflicting files 
found.  |
   | +0 :ok: |  codespell  |   0m  0s |  |  codespell was not available.  |
   | +0 :ok: |  detsecrets  |   0m  0s |  |  detect-secrets was not available.  
|
   | +0 :ok: |  xmllint  |   0m  0s |  |  xmllint was not available.  |
   | +1 :green_heart: |  @author  |   0m  0s |  |  The patch does not contain 
any @author tags.  |
   | +1 :green_heart: |  test4tests  |   0m  0s |  |  The patch appears to 
include 6 new or modified test files.  |
    _ trunk Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |  31m  9s |  |  trunk passed  |
   | +1 :green_heart: |  compile  |   0m 25s |  |  trunk passed with JDK 
Ubuntu-11.0.24+8-post-Ubuntu-1ubuntu320.04  |
   | +1 :green_heart: |  compile  |   0m 21s |  |  trunk passed with JDK 
Private Build-1.8.0_422-8u422-b05-1~20.04-b05  |
   | +1 :green_heart: |  checkstyle  |   0m 23s |  |  trunk passed  |
   | +1 :green_heart: |  mvnsite  |   0m 29s |  |  trunk passed  |
   | +1 :green_heart: |  javadoc  |   0m 26s |  |  trunk passed with JDK 
Ubuntu-11.0.24+8-post-Ubuntu-1ubuntu320.04  |
   | +1 :green_heart: |  javadoc  |   0m 22s |  |  trunk passed with JDK 
Private Build-1.8.0_422-8u422-b05-1~20.04-b05  |
   | +1 :green_heart: |  spotbugs  |   0m 42s |  |  trunk passed  |
   | +1 :green_heart: |  shadedclient  |  20m  4s |  |  branch has no errors 
when building and testing our client artifacts.  |
    _ Patch Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |   0m 19s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   0m 19s |  |  the patch passed with JDK 
Ubuntu-11.0.24+8-post-Ubuntu-1ubuntu320.04  |
   | +1 :green_heart: |  javac  |   0m 19s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   0m 17s |  |  the patch passed with JDK 
Private Build-1.8.0_422-8u422-b05-1~20.04-b05  |
   | +1 :green_heart: |  javac  |   0m 17s |  |  the patch passed  |
   | +1 :green_heart: |  blanks  |   0m  0s |  |  The patch has no blanks 
issues.  |
   | -0 :warning: |  checkstyle  |   0m 12s | 
[/results-checkstyle-hadoop-tools_hadoop-azure.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-7104/2/artifact/out/results-checkstyle-hadoop-tools_hadoop-azure.txt)
 |  hadoop-tools/hadoop-azure: The patch generated 5 new + 11 unchanged - 0 
fixed = 16 total (was 11)  |
   | +1 :green_heart: |  mvnsite  |   0m 18s |  |  the patch passed  |
   | +1 :green_heart: |  javadoc  |   0m 18s |  |  the patch passed with JDK 
Ubuntu-11.0.24+8-post-Ubuntu-1ubuntu320.04  |
   | +1 :green_heart: |  javadoc  |   0m 17s |  |  the patch passed with JDK 
Private Build-1.8.0_422-8u422-b05-1~20.04-b05  |
   | +1 :green_heart: |  spotbugs  |   0m 43s |  |  the patch passed  |
   | +1 :green_heart: |  shadedclient  |  20m  7s |  |  patch has no errors 
when building and testing our client artifacts.  |
    _ Other Tests _ |
   | +1 :green_heart: |  unit  |   2m  0s |  |  hadoop-azure in the patch 
passed.  |
   | +1 :green_heart: |  asflicense  |   0m 26s |  |  The patch does not 
generate ASF License warnings.  |
   |  |   |  87m 35s |  |  |
   
   
   | Subsystem | Report/Notes |
   |--:|:-|
   | Docker | ClientAPI=1.47 ServerAPI=1.47 base: 
https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-7104/2/artifact/out/Dockerfile
 |
   | GITHUB PR | https://github.com/apache/hadoop/pull/7104 |
   | Optional Tests | dupname asflicense compile javac javadoc mvninstall 
mvnsite unit shadedclient codespell detsecrets xmllint spotbugs checkstyle |
   | uname | Linux 6525ace70d10 5.15.0-117-generic #127-Ubuntu SMP Fri Jul 5 
20:13:28 UTC 2024 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | dev-support/bin/hadoop.sh |
   | git revision | trunk / acca3fb677398e971d66abbcfc3749f0b3edde27 |
   | Default Java | Private Build-1.8.0_422-8u422-b05-1~20.04-b05 |
   | Multi-JDK versions | 
/usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.24+8-post-Ubuntu-1ubuntu320.04 
/usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_422-8u422-b05-1~20.04-b05 
|
   |  Test Results | 
https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-7104/2/testReport/ |
   | Max. process+thread count | 561 (vs. ulimit of 5500) |
   | modules | C: hadoop-tools/hadoop-azure U: hadoop-tools/hadoop-azure |
   | Console output | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-7104/2/console |

[jira] [Commented] (HADOOP-18325) ABFS: Add correlated metric support for ABFS operations

2024-10-09 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-18325?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17887868#comment-17887868
 ] 

ASF GitHub Bot commented on HADOOP-18325:
-

anujmodi2021 commented on code in PR #6847:
URL: https://github.com/apache/hadoop/pull/6847#discussion_r1793211397


##
hadoop-tools/hadoop-azure/src/test/java/org/apache/hadoop/fs/azurebfs/ITestAbfsReadFooterMetrics.java:
##
@@ -47,6 +52,20 @@
 public class ITestAbfsReadFooterMetrics extends AbstractAbfsScaleTest {
 
   public ITestAbfsReadFooterMetrics() throws Exception {
+    checkPrerequisites();
+  }
+
+  private void checkPrerequisites() {
+    checkIfConfigIsSet(FS_AZURE_METRIC_ACCOUNT_NAME);

Review Comment:
   We already have a method in the base class for this purpose; we can use 
`assumeValidTestConfigPresent` itself.
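   A sketch of the suggested change (the helper name comes from the review comment; its exact signature is assumed):
   
   ```java
   // Hypothetical rewrite of the prerequisite check using the existing
   // base-class helper instead of a new checkIfConfigIsSet method.
   private void checkPrerequisites() {
     assumeValidTestConfigPresent(FS_AZURE_METRIC_ACCOUNT_NAME);
   }
   ```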





> ABFS: Add correlated metric support for ABFS operations
> ---
>
> Key: HADOOP-18325
> URL: https://issues.apache.org/jira/browse/HADOOP-18325
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/azure
>Affects Versions: 3.3.3
>Reporter: Anmol Asrani
>Assignee: Anmol Asrani
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.5.0
>
>
> Add metrics related to a particular job, specific to number of total 
> requests, retried requests, retry count and others



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-18960) ABFS contract-tests with Hadoop-Commons intermittently failing

2024-10-09 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-18960?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17887863#comment-17887863
 ] 

ASF GitHub Bot commented on HADOOP-18960:
-

anujmodi2021 commented on PR #7104:
URL: https://github.com/apache/hadoop/pull/7104#issuecomment-2401796321

   @steveloughran @mukund-thakur 
   Please review this PR, which fixes a few tests known to fail due to missing 
configs, as well as some that fail intermittently.
   




> ABFS contract-tests with Hadoop-Commons intermittently failing
> --
>
> Key: HADOOP-18960
> URL: https://issues.apache.org/jira/browse/HADOOP-18960
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/azure
>Reporter: Pranav Saxena
>Assignee: Anuj Modi
>Priority: Minor
>  Labels: pull-request-available
>
> In the merged PR [HADOOP-18869: [ABFS] Fixing Behavior of a File System APIs 
> on root path by anujmodi2021 · Pull Request #6003 · apache/hadoop 
> (github.com)|https://github.com/apache/hadoop/pull/6003], a config was 
> switched on: `fs.contract.test.root-tests-enabled`. This enables the root 
> manipulation tests for the filesystem contract.
> The execution of contract-tests in ABFS follows the executionId 
> integration-test-abfs-parallel-classes of the pom. The tests run in different 
> JVMs, and at a given instant multiple such JVMs can exist, depending on 
> ${testsThreadCount}. The problem is that all the test JVMs for contract-tests 
> use the same container for test runs, which is defined by 
> `fs.contract.test.fs.abfs`. Because of this, one JVM's root-contract runs can 
> influence another JVM's root-contract runs, leading to CI failures for the 
> hadoop-azure package.
> The solution is to run these tests sequentially, separate from the other 
> commit/manifest tests.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Updated] (HADOOP-18960) ABFS contract-tests with Hadoop-Commons intermittently failing

2024-10-09 Thread Anuj Modi (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-18960?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anuj Modi updated HADOOP-18960:
---
Labels: pull-request-available  (was: )

> ABFS contract-tests with Hadoop-Commons intermittently failing
> --
>
> Key: HADOOP-18960
> URL: https://issues.apache.org/jira/browse/HADOOP-18960
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/azure
>Reporter: Pranav Saxena
>Assignee: Anuj Modi
>Priority: Minor
>  Labels: pull-request-available
>
> In the merged PR [HADOOP-18869: [ABFS] Fixing Behavior of a File System APIs 
> on root path by anujmodi2021 · Pull Request #6003 · apache/hadoop 
> (github.com)|https://github.com/apache/hadoop/pull/6003], a config was 
> switched on: `fs.contract.test.root-tests-enabled`. This enables the root 
> manipulation tests for the filesystem contract.
> The execution of contract-tests in ABFS follows the executionId 
> integration-test-abfs-parallel-classes of the pom. The tests run in different 
> JVMs, and at a given instant multiple such JVMs can exist, depending on 
> ${testsThreadCount}. The problem is that all the test JVMs for contract-tests 
> use the same container for test runs, which is defined by 
> `fs.contract.test.fs.abfs`. Because of this, one JVM's root-contract runs can 
> influence another JVM's root-contract runs, leading to CI failures for the 
> hadoop-azure package.
> The solution is to run these tests sequentially, separate from the other 
> commit/manifest tests.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-19106) [ABFS] All tests of ITestAzureBlobFileSystemAuthorization fail with NPE

2024-10-09 Thread Anuj Modi (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-19106?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17887858#comment-17887858
 ] 

Anuj Modi commented on HADOOP-19106:


Hi [~mthakur] 
Sorry for the delay. I finally got some time to work on these issues.
Have created a PR for the fixes: [HADOOP-18960: [ABFS] Making Contract tests 
run in sequential and Other Test Fixes by anujmodi2021 · Pull Request #7104 · 
apache/hadoop (github.com)|https://github.com/apache/hadoop/pull/7104]

> [ABFS] All tests of ITestAzureBlobFileSystemAuthorization fail with NPE
> -
>
> Key: HADOOP-19106
> URL: https://issues.apache.org/jira/browse/HADOOP-19106
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: fs/azure
>Affects Versions: 3.4.0
>Reporter: Mukund Thakur
>Assignee: Anuj Modi
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.4.1
>
>
> When the config below is set to true, all of the tests fail; otherwise they are skipped.
> <property>
>   <name>fs.azure.test.namespace.enabled</name>
>   <value>true</value>
> </property>
>  
> [*ERROR*] 
> testOpenFileAuthorized(org.apache.hadoop.fs.azurebfs.ITestAzureBlobFileSystemAuthorization)
>   Time elapsed: 0.064 s  <<< ERROR!
> java.lang.NullPointerException
>  at 
> org.apache.hadoop.fs.azurebfs.ITestAzureBlobFileSystemAuthorization.runTest(ITestAzureBlobFileSystemAuthorization.java:273)
>  at 
> org.apache.hadoop.fs.azurebfs.ITestAzureBlobFileSystemAuthorization.testOpenFileAuthorized(ITestAzureBlobFileSystemAuthorization.java:132)
>  at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>  at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>  at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>  at java.lang.reflect.Method.invoke(Method.java:498)
>  at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59)
>  at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>  at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56)
>  at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>  at 
> org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
>  at 
> org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-19291) `CombinedFileRange.merge` should not convert disjoint ranges into overlapped ones

2024-10-09 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-19291?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17887837#comment-17887837
 ] 

ASF GitHub Bot commented on HADOOP-19291:
-

hadoop-yetus commented on PR #7101:
URL: https://github.com/apache/hadoop/pull/7101#issuecomment-2401737490

   :confetti_ball: **+1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |::|--:|:|::|:---:|
   | +0 :ok: |  reexec  |   0m 50s |  |  Docker mode activated.  |
    _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  0s |  |  No case conflicting files 
found.  |
   | +0 :ok: |  codespell  |   0m  0s |  |  codespell was not available.  |
   | +0 :ok: |  detsecrets  |   0m  0s |  |  detect-secrets was not available.  
|
   | +0 :ok: |  markdownlint  |   0m  0s |  |  markdownlint was not available.  
|
   | +0 :ok: |  xmllint  |   0m  0s |  |  xmllint was not available.  |
   | +1 :green_heart: |  @author  |   0m  0s |  |  The patch does not contain 
any @author tags.  |
   | +1 :green_heart: |  test4tests  |   0m  0s |  |  The patch appears to 
include 3 new or modified test files.  |
    _ trunk Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |  50m 58s |  |  trunk passed  |
   | +1 :green_heart: |  compile  |  21m  2s |  |  trunk passed with JDK 
Ubuntu-11.0.24+8-post-Ubuntu-1ubuntu320.04  |
   | +1 :green_heart: |  compile  |  19m 45s |  |  trunk passed with JDK 
Private Build-1.8.0_422-8u422-b05-1~20.04-b05  |
   | +1 :green_heart: |  checkstyle  |   1m 20s |  |  trunk passed  |
   | +1 :green_heart: |  mvnsite  |   1m 46s |  |  trunk passed  |
   | +1 :green_heart: |  javadoc  |   1m 19s |  |  trunk passed with JDK 
Ubuntu-11.0.24+8-post-Ubuntu-1ubuntu320.04  |
   | +1 :green_heart: |  javadoc  |   0m 55s |  |  trunk passed with JDK 
Private Build-1.8.0_422-8u422-b05-1~20.04-b05  |
   | +1 :green_heart: |  spotbugs  |   2m 42s |  |  trunk passed  |
   | +1 :green_heart: |  shadedclient  |  41m 26s |  |  branch has no errors 
when building and testing our client artifacts.  |
    _ Patch Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |   0m 58s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |  20m 10s |  |  the patch passed with JDK 
Ubuntu-11.0.24+8-post-Ubuntu-1ubuntu320.04  |
   | +1 :green_heart: |  javac  |  20m 10s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |  19m 52s |  |  the patch passed with JDK 
Private Build-1.8.0_422-8u422-b05-1~20.04-b05  |
   | +1 :green_heart: |  javac  |  19m 52s |  |  the patch passed  |
   | +1 :green_heart: |  blanks  |   0m  0s |  |  The patch has no blanks 
issues.  |
   | +1 :green_heart: |  checkstyle  |   1m 16s |  |  the patch passed  |
   | +1 :green_heart: |  mvnsite  |   1m 45s |  |  the patch passed  |
   | +1 :green_heart: |  javadoc  |   1m 12s |  |  the patch passed with JDK 
Ubuntu-11.0.24+8-post-Ubuntu-1ubuntu320.04  |
   | +1 :green_heart: |  javadoc  |   0m 54s |  |  the patch passed with JDK 
Private Build-1.8.0_422-8u422-b05-1~20.04-b05  |
   | +1 :green_heart: |  spotbugs  |   2m 50s |  |  the patch passed  |
   | +1 :green_heart: |  shadedclient  |  42m 18s |  |  patch has no errors 
when building and testing our client artifacts.  |
    _ Other Tests _ |
   | +1 :green_heart: |  unit  |  20m 21s |  |  hadoop-common in the patch 
passed.  |
   | +1 :green_heart: |  asflicense  |   1m  4s |  |  The patch does not 
generate ASF License warnings.  |
   |  |   | 255m 15s |  |  |
   
   
   | Subsystem | Report/Notes |
   |--:|:-|
   | Docker | ClientAPI=1.47 ServerAPI=1.47 base: 
https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-7101/2/artifact/out/Dockerfile
 |
   | GITHUB PR | https://github.com/apache/hadoop/pull/7101 |
   | Optional Tests | dupname asflicense compile javac javadoc mvninstall 
mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets markdownlint 
xmllint |
   | uname | Linux 770cb0a1c896 5.15.0-119-generic #129-Ubuntu SMP Fri Aug 2 
19:25:20 UTC 2024 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | dev-support/bin/hadoop.sh |
   | git revision | trunk / f6c7bda1de5bb326fff0c9dc0185114c90c459c0 |
   | Default Java | Private Build-1.8.0_422-8u422-b05-1~20.04-b05 |
   | Multi-JDK versions | 
/usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.24+8-post-Ubuntu-1ubuntu320.04 
/usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_422-8u422-b05-1~20.04-b05 
|
   |  Test Results | 
https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-7101/2/testReport/ |
   | Max. process+thread count | 1263 (vs. ulimit of 5500) |
   | modules | C: hadoop-common-project/hadoop-common U: 
hadoop-common-project/hadoop-common |
   | Console output | 
https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-7101/2/console |
   | versions | git=2.25.1 maven=3.6.3 spotbugs=4

[jira] [Updated] (HADOOP-18960) ABFS contract-tests with Hadoop-Commons intermittently failing

2024-10-08 Thread Anuj Modi (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-18960?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anuj Modi updated HADOOP-18960:
---
Description: 
In the merged PR [HADOOP-18869: [ABFS] Fixing Behavior of a File System APIs on 
root path by anujmodi2021 · Pull Request #6003 · apache/hadoop 
(github.com)|https://github.com/apache/hadoop/pull/6003], a config was 
switched on: `fs.contract.test.root-tests-enabled`. This enables the root 
manipulation tests for the filesystem contract.

The execution of contract-tests in ABFS follows the executionId 
integration-test-abfs-parallel-classes of the pom. The tests run in different 
JVMs, and at a given instant multiple such JVMs can exist, depending on 
${testsThreadCount}. The problem is that all the test JVMs for contract-tests 
use the same container for test runs, which is defined by 
`fs.contract.test.fs.abfs`. Because of this, one JVM's root-contract runs can 
influence another JVM's root-contract runs, leading to CI failures for the 
hadoop-azure package.

The solution is to run these tests sequentially, separate from the other 
commit/manifest tests.

  was:
In the merged PR [HADOOP-18869: [ABFS] Fixing Behavior of a File System APIs on 
root path by anujmodi2021 · Pull Request #6003 · apache/hadoop 
(github.com)|https://github.com/apache/hadoop/pull/6003], a config was 
switched on: `fs.contract.test.root-tests-enabled`. This enables the root 
manipulation tests for the filesystem contract.

The execution of contract-tests in ABFS follows the executionId 
integration-test-abfs-parallel-classes of the pom. The tests run in different 
JVMs, and at a given instant multiple such JVMs can exist, depending on 
${testsThreadCount}. The problem is that all the test JVMs for contract-tests 
use the same container for test runs, which is defined by 
`fs.contract.test.fs.abfs`. Because of this, one JVM's root-contract runs can 
influence another JVM's root-contract runs, leading to CI failures for the 
hadoop-azure package.


> ABFS contract-tests with Hadoop-Commons intermittently failing
> --
>
> Key: HADOOP-18960
> URL: https://issues.apache.org/jira/browse/HADOOP-18960
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/azure
>Reporter: Pranav Saxena
>Priority: Minor
>
> In the merged PR [HADOOP-18869: [ABFS] Fixing Behavior of a File System APIs 
> on root path by anujmodi2021 · Pull Request #6003 · apache/hadoop 
> (github.com)|https://github.com/apache/hadoop/pull/6003], a config was 
> switched on: `fs.contract.test.root-tests-enabled`. This enables the root 
> manipulation tests for the filesystem contract.
> The execution of contract-tests in ABFS follows the executionId 
> integration-test-abfs-parallel-classes of the pom. The tests run in different 
> JVMs, and at a given instant multiple such JVMs can exist, depending on 
> ${testsThreadCount}. The problem is that all the test JVMs for contract-tests 
> use the same container for test runs, which is defined by 
> `fs.contract.test.fs.abfs`. Because of this, one JVM's root-contract runs can 
> influence another JVM's root-contract runs, leading to CI failures for the 
> hadoop-azure package.
> The solution is to run these tests sequentially, separate from the other 
> commit/manifest tests.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Assigned] (HADOOP-18960) ABFS contract-tests with Hadoop-Commons intermittently failing

2024-10-08 Thread Anuj Modi (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-18960?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anuj Modi reassigned HADOOP-18960:
--

Assignee: Anuj Modi

> ABFS contract-tests with Hadoop-Commons intermittently failing
> --
>
> Key: HADOOP-18960
> URL: https://issues.apache.org/jira/browse/HADOOP-18960
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/azure
>Reporter: Pranav Saxena
>Assignee: Anuj Modi
>Priority: Minor
>
> In the merged PR [HADOOP-18869: [ABFS] Fixing Behavior of a File System APIs 
> on root path by anujmodi2021 · Pull Request #6003 · apache/hadoop 
> (github.com)|https://github.com/apache/hadoop/pull/6003], a config was 
> switched on: `fs.contract.test.root-tests-enabled`. This enables the root 
> manipulation tests for the filesystem contract.
> The execution of contract-tests in ABFS follows the executionId 
> integration-test-abfs-parallel-classes of the pom. The tests run in different 
> JVMs, and at a given instant multiple such JVMs can exist, depending on 
> ${testsThreadCount}. The problem is that all the test JVMs for contract-tests 
> use the same container for test runs, which is defined by 
> `fs.contract.test.fs.abfs`. Because of this, one JVM's root-contract runs can 
> influence another JVM's root-contract runs, leading to CI failures for the 
> hadoop-azure package.
> The solution is to run these tests sequentially, separate from the other 
> commit/manifest tests.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-19304) [JDK 17] Resolve Http Server error and Http response error in Hadoop Trunk

2024-10-08 Thread Shilun Fan (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-19304?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17887785#comment-17887785
 ] 

Shilun Fan commented on HADOOP-19304:
-

Thank you very much for your attention to JDK 17. That's great! I will help 
review the PR, but I may not merge it immediately. I will submit the final version 
of HADOOP-15984 for review as soon as possible. I hope the impact on 
HADOOP-15984 can be minimized, because I've spent a lot of time just getting it 
to compile successfully.

> [JDK 17] Resolve Http Server error and Http response error in Hadoop Trunk
> --
>
> Key: HADOOP-19304
> URL: https://issues.apache.org/jira/browse/HADOOP-19304
> Project: Hadoop Common
>  Issue Type: Sub-task
>Reporter: Muskan Mishra
>Assignee: Muskan Mishra
>Priority: Major
>
> While compiling HADOOP-TRUNK on JDK 17, we faced 2 common issues:
> *1.* Unexpected HTTP response: *code=500 != 200* or *code=500 != 307*
> *2.* org.apache.hadoop.yarn.exceptions.YarnRuntimeException: 
> org.apache.hadoop.yarn.webapp.WebAppException: *Error starting http server*



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Updated] (HADOOP-19304) [JDK 17] Resolve Http Server error and Http response error in Hadoop Trunk

2024-10-08 Thread Muskan Mishra (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-19304?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Muskan Mishra updated HADOOP-19304:
---
Parent: HADOOP-17177
Issue Type: Sub-task  (was: Task)

> [JDK 17] Resolve Http Server error and Http response error in Hadoop Trunk
> --
>
> Key: HADOOP-19304
> URL: https://issues.apache.org/jira/browse/HADOOP-19304
> Project: Hadoop Common
>  Issue Type: Sub-task
>Reporter: Muskan Mishra
>Assignee: Muskan Mishra
>Priority: Major
>
> While compiling HADOOP-TRUNK on JDK 17, we faced 2 common issues:
> *1.* Unexpected HTTP response: *code=500 != 200* or *code=500 != 307*
> *2.* org.apache.hadoop.yarn.exceptions.YarnRuntimeException: 
> org.apache.hadoop.yarn.webapp.WebAppException: *Error starting http server*



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Created] (HADOOP-19304) [JDK 17] Resolve Http Server error and Http response error in Hadoop Trunk

2024-10-08 Thread Muskan Mishra (Jira)
Muskan Mishra created HADOOP-19304:
--

 Summary: [JDK 17] Resolve Http Server error and Http response 
error in Hadoop Trunk
 Key: HADOOP-19304
 URL: https://issues.apache.org/jira/browse/HADOOP-19304
 Project: Hadoop Common
  Issue Type: Task
Reporter: Muskan Mishra
Assignee: Muskan Mishra


While compiling HADOOP-TRUNK on JDK 17, we faced 2 common issues:

*1.* Unexpected HTTP response: *code=500 != 200* or *code=500 != 307*

*2.* org.apache.hadoop.yarn.exceptions.YarnRuntimeException: 
org.apache.hadoop.yarn.webapp.WebAppException: *Error starting http server*



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-19291) `CombinedFileRange.merge` should not convert disjoint ranges into overlapped ones

2024-10-08 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-19291?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17887769#comment-17887769
 ] 

ASF GitHub Bot commented on HADOOP-19291:
-

mukund-thakur commented on PR #7101:
URL: https://github.com/apache/hadoop/pull/7101#issuecomment-2401334291

   cc @dongjoon-hyun 




> `CombinedFileRange.merge` should not convert disjoint ranges into overlapped 
> ones
> -
>
> Key: HADOOP-19291
> URL: https://issues.apache.org/jira/browse/HADOOP-19291
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: common
>Affects Versions: 3.3.9, 3.5.0, 3.4.1
>Reporter: Dongjoon Hyun
>Priority: Major
>  Labels: pull-request-available
> Attachments: Screenshot 2024-09-28 at 22.02.01.png
>
>
> Currently, Hadoop has a bug that converts disjoint ranges into overlapped ones 
> and eventually fails by itself.
> {code:java}
> +  public void testMergeSortedRanges() {
> +    List<FileRange> input = asList(
> +        createFileRange(13816220, 24, null),
> +        createFileRange(13816244, 7423960, null)
> +    );
> +    assertIsNotOrderedDisjoint(input, 100, 800);
> +    final List<CombinedFileRange> outputList = mergeSortedRanges(
> +        sortRangeList(input), 100, 1001, 2500);
> +
> +    assertRangeListSize(outputList, 1);
> +    assertFileRange(outputList.get(0), 13816200, 7424100);
> +  }
> {code}
>  !Screenshot 2024-09-28 at 22.02.01.png! 



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Comment Edited] (HADOOP-19229) Vector IO on cloud storage: what is a good minimum seek size?

2024-10-08 Thread Mukund Thakur (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-19229?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17887574#comment-17887574
 ] 

Mukund Thakur edited comment on HADOOP-19229 at 10/9/24 5:12 AM:
-

This is exactly what "fs.s3a.vectored.read.min.seek.size" does. We set it to 
4K; maybe we should review it. The Facebook Velox paper says that 20 kB is 
better for cloud storage.
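For example, a sketch of raising it (standard Hadoop Configuration API; the "20K" size-suffix syntax is assumed to be accepted for this key):

{code:java}
import org.apache.hadoop.conf.Configuration;

// Sketch: lift the vectored-read minimum seek size from the 4K default
// towards the 20 kB figure suggested by the Velox paper.
Configuration conf = new Configuration();
conf.set("fs.s3a.vectored.read.min.seek.size", "20K");
{code}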


was (Author: ste...@apache.org):
This is exactly what "fs.s3a.vectored.read.min.seek.size" does. We set it to 
4K; maybe we should review it. The Facebook Velox paper says that 20 kB is 
better for cloud storage.

> Vector IO on cloud storage: what is a good minimum seek size?
> -
>
> Key: HADOOP-19229
> URL: https://issues.apache.org/jira/browse/HADOOP-19229
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Affects Versions: 3.4.1
>Reporter: Steve Loughran
>Priority: Major
>
> Vector IO has a max size to coalesce ranges, but it also needs a maximum gap 
> between ranges to justify the merge. Right now we could have a read where two 
> vectors of size 8 bytes can be merged with a 1 MB gap between them - and 
> that's wasteful.
> We could also consider an "efficiency" metric which looks at the ratio of 
> bytes-read to bytes-discarded. Not sure what we'd do with it, but we could 
> track it as an IOStat.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-19291) `CombinedFileRange.merge` should not convert disjoint ranges into overlapped ones

2024-10-08 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-19291?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17887683#comment-17887683
 ] 

ASF GitHub Bot commented on HADOOP-19291:
-

hadoop-yetus commented on PR #7101:
URL: https://github.com/apache/hadoop/pull/7101#issuecomment-2400625449

   :broken_heart: **-1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |::|--:|:|::|:---:|
   | +0 :ok: |  reexec  |  18m 48s |  |  Docker mode activated.  |
    _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  0s |  |  No case conflicting files 
found.  |
   | +0 :ok: |  codespell  |   0m  1s |  |  codespell was not available.  |
   | +0 :ok: |  detsecrets  |   0m  1s |  |  detect-secrets was not available.  
|
   | +0 :ok: |  markdownlint  |   0m  1s |  |  markdownlint was not available.  
|
   | +0 :ok: |  xmllint  |   0m  1s |  |  xmllint was not available.  |
   | +1 :green_heart: |  @author  |   0m  0s |  |  The patch does not contain 
any @author tags.  |
   | +1 :green_heart: |  test4tests  |   0m  0s |  |  The patch appears to 
include 3 new or modified test files.  |
    _ trunk Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |  48m 37s |  |  trunk passed  |
   | +1 :green_heart: |  compile  |  20m 37s |  |  trunk passed with JDK 
Ubuntu-11.0.24+8-post-Ubuntu-1ubuntu320.04  |
   | +1 :green_heart: |  compile  |  19m 57s |  |  trunk passed with JDK 
Private Build-1.8.0_422-8u422-b05-1~20.04-b05  |
   | +1 :green_heart: |  checkstyle  |   1m 20s |  |  trunk passed  |
   | +1 :green_heart: |  mvnsite  |   1m 47s |  |  trunk passed  |
   | +1 :green_heart: |  javadoc  |   1m 20s |  |  trunk passed with JDK 
Ubuntu-11.0.24+8-post-Ubuntu-1ubuntu320.04  |
   | +1 :green_heart: |  javadoc  |   0m 54s |  |  trunk passed with JDK 
Private Build-1.8.0_422-8u422-b05-1~20.04-b05  |
   | +1 :green_heart: |  spotbugs  |   2m 41s |  |  trunk passed  |
   | +1 :green_heart: |  shadedclient  |  42m 16s |  |  branch has no errors 
when building and testing our client artifacts.  |
    _ Patch Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |   0m 59s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |  20m  9s |  |  the patch passed with JDK 
Ubuntu-11.0.24+8-post-Ubuntu-1ubuntu320.04  |
   | +1 :green_heart: |  javac  |  20m  9s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |  20m  6s |  |  the patch passed with JDK 
Private Build-1.8.0_422-8u422-b05-1~20.04-b05  |
   | +1 :green_heart: |  javac  |  20m  6s |  |  the patch passed  |
   | -1 :x: |  blanks  |   0m  0s | 
[/blanks-eol.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-7101/1/artifact/out/blanks-eol.txt)
 |  The patch has 1 line(s) that end in blanks. Use git apply --whitespace=fix 
<>. Refer https://git-scm.com/docs/git-apply  |
   | +1 :green_heart: |  checkstyle  |   1m 14s |  |  the patch passed  |
   | +1 :green_heart: |  mvnsite  |   1m 43s |  |  the patch passed  |
   | +1 :green_heart: |  javadoc  |   1m 14s |  |  the patch passed with JDK 
Ubuntu-11.0.24+8-post-Ubuntu-1ubuntu320.04  |
   | +1 :green_heart: |  javadoc  |   0m 53s |  |  the patch passed with JDK 
Private Build-1.8.0_422-8u422-b05-1~20.04-b05  |
   | +1 :green_heart: |  spotbugs  |   2m 49s |  |  the patch passed  |
   | +1 :green_heart: |  shadedclient  |  42m 54s |  |  patch has no errors 
when building and testing our client artifacts.  |
    _ Other Tests _ |
   | +1 :green_heart: |  unit  |  20m 14s |  |  hadoop-common in the patch 
passed.  |
   | +1 :green_heart: |  asflicense  |   1m  3s |  |  The patch does not 
generate ASF License warnings.  |
   |  |   | 272m 26s |  |  |
   
   
   | Subsystem | Report/Notes |
   |--:|:-|
   | Docker | ClientAPI=1.47 ServerAPI=1.47 base: 
https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-7101/1/artifact/out/Dockerfile
 |
   | GITHUB PR | https://github.com/apache/hadoop/pull/7101 |
   | Optional Tests | dupname asflicense compile javac javadoc mvninstall 
mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets markdownlint 
xmllint |
   | uname | Linux d61c8cfa2b09 5.15.0-119-generic #129-Ubuntu SMP Fri Aug 2 
19:25:20 UTC 2024 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | dev-support/bin/hadoop.sh |
   | git revision | trunk / dacf85d14fa341f9cdb1116352743d0f8ed3f69f |
   | Default Java | Private Build-1.8.0_422-8u422-b05-1~20.04-b05 |
   | Multi-JDK versions | 
/usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.24+8-post-Ubuntu-1ubuntu320.04 
/usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_422-8u422-b05-1~20.04-b05 
|
   |  Test Results | 
https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-7101/1/testReport/ |
   | Max. process+thread count | 3137 (vs. ulimit of 5500) |
   | modules | C: hadoop-common-project/hadoop-common U: hadoop-common-project/hadoop-common |

[jira] [Updated] (HADOOP-19295) S3A: fs.s3a.connection.request.timeout too low for large uploads over slow links

2024-10-08 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-19295?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HADOOP-19295:

Labels: pull-request-available  (was: )

> S3A: fs.s3a.connection.request.timeout too low for large uploads over slow 
> links
> 
>
> Key: HADOOP-19295
> URL: https://issues.apache.org/jira/browse/HADOOP-19295
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Affects Versions: 3.4.0, 3.4.1
>Reporter: Steve Loughran
>Assignee: Steve Loughran
>Priority: Major
>  Labels: pull-request-available
>
> The value of {{fs.s3a.connection.request.timeout}} (default = 60s) is too low 
> for large uploads over slow connections.
> I suspect something changed between the v1 and v2 SDK versions so that PUT 
> was exempt from the normal timeouts. It is not, and this now surfaces as 
> failures to upload 1+ GB files over slower network connections. Smaller (for 
> example 128 MB) files work.
> The parallel queuing of writes in the S3ABlockOutputStream helps create this 
> problem, as it queues multiple blocks at the same time, so per-block 
> bandwidth becomes the available bandwidth divided by the number of blocks; 
> four blocks cut the per-block capacity down to a quarter.
> The fix is straightforward: use a much bigger timeout. I'm going to propose 
> 15 minutes. We need to strike a balance between upload time allocation and 
> other requests timing out.
> I do worry about other consequences; we've found that timeout exceptions are 
> happy to hide the underlying causes of retry failures - so in fact this may 
> be better for everything except a server hanging after the HTTP request is 
> initiated.
> Too bad we can't alter the timeout for different requests.
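A sketch of the proposed setting for affected deployments (the duration-suffix syntax of recent S3A releases is assumed; otherwise use a millisecond value):

{code:java}
import org.apache.hadoop.conf.Configuration;

// Sketch: raise the S3A request timeout for large uploads over slow links.
// "15m" follows the proposal above; verify the value syntax against your
// Hadoop version, and note this lengthens how long hung requests linger.
Configuration conf = new Configuration();
conf.set("fs.s3a.connection.request.timeout", "15m");
{code}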



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-19295) S3A: fs.s3a.connection.request.timeout too low for large uploads over slow links

2024-10-08 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-19295?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17887670#comment-17887670
 ] 

ASF GitHub Bot commented on HADOOP-19295:
-

steveloughran merged PR #7100:
URL: https://github.com/apache/hadoop/pull/7100




> S3A: fs.s3a.connection.request.timeout too low for large uploads over slow 
> links
> 
>
> Key: HADOOP-19295
> URL: https://issues.apache.org/jira/browse/HADOOP-19295
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Affects Versions: 3.4.0, 3.4.1
>Reporter: Steve Loughran
>Assignee: Steve Loughran
>Priority: Major
>
> The value of {{fs.s3a.connection.request.timeout}} (default = 60s) is too low 
> for large uploads over slow connections.
> I suspect something changed between the v1 and v2 SDK versions so that PUT 
> was exempt from the normal timeouts. It is not, and this now surfaces as 
> failures to upload 1+ GB files over slower network connections. Smaller (for 
> example 128 MB) files work.
> The parallel queuing of writes in the S3ABlockOutputStream helps create this 
> problem, as it queues multiple blocks at the same time, so per-block 
> bandwidth becomes the available bandwidth divided by the number of blocks; 
> four blocks cut the per-block capacity down to a quarter.
> The fix is straightforward: use a much bigger timeout. I'm going to propose 
> 15 minutes. We need to strike a balance between upload time allocation and 
> other requests timing out.
> I do worry about other consequences; we've found that timeout exceptions are 
> happy to hide the underlying causes of retry failures - so in fact this may 
> be better for everything except a server hanging after the HTTP request is 
> initiated.
> Too bad we can't alter the timeout for different requests.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Updated] (HADOOP-18487) Make protobuf 2.5 an optional runtime dependency.

2024-10-08 Thread Steve Loughran (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-18487?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Loughran updated HADOOP-18487:

Fix Version/s: 3.4.0
   (was: 3.4.1)

> Make protobuf 2.5 an optional runtime dependency.
> -
>
> Key: HADOOP-18487
> URL: https://issues.apache.org/jira/browse/HADOOP-18487
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: build, ipc
>Affects Versions: 3.3.4
>Reporter: Steve Loughran
>Assignee: Steve Loughran
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.4.0, 3.3.9, 3.5.0
>
>
> Uses of protobuf 2.5 and RpcEngine have been deprecated since 3.3.0 in 
> HADOOP-17046.
> While still keeping those files around (for a long time...), how about we 
> make the protobuf 2.5.0 export of hadoop-common and hadoop-hdfs *provided*, 
> rather than *compile*?
> That way, if apps want it for their own APIs, they have to explicitly ask for 
> it, but at least our own scans don't break.
> I have no idea what will happen to the rest of the stack at this point; it 
> will be "interesting" to see.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Updated] (HADOOP-19229) Vector IO on cloud storage: what is a good minimum seek size?

2024-10-08 Thread Steve Loughran (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-19229?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Loughran updated HADOOP-19229:

Summary: Vector IO on cloud storage: what is a good minimum seek size?  
(was: Vector IO on cloud storage: what is a good minimum seek size should be)

> Vector IO on cloud storage: what is a good minimum seek size?
> -
>
> Key: HADOOP-19229
> URL: https://issues.apache.org/jira/browse/HADOOP-19229
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Affects Versions: 3.4.1
>Reporter: Steve Loughran
>Priority: Major
>
> Vector IO has a max size to coalesce ranges, but it also needs a maximum gap 
> between ranges to justify the merge. Right now we could have a read where two 
> vectors of size 8 bytes can be merged with a 1 MB gap between them - and 
> that's wasteful.
> We could also consider an "efficiency" metric which looks at the ratio of 
> bytes-read to bytes-discarded. Not sure what we'd do with it, but we could 
> track it as an IOStat.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Updated] (HADOOP-18487) Make protobuf 2.5 an optional runtime dependency.

2024-10-08 Thread Steve Loughran (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-18487?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Loughran updated HADOOP-18487:

Fix Version/s: 3.5.0
   (was: 3.4.0)

> Make protobuf 2.5 an optional runtime dependency.
> -
>
> Key: HADOOP-18487
> URL: https://issues.apache.org/jira/browse/HADOOP-18487
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: build, ipc
>Affects Versions: 3.3.4
>Reporter: Steve Loughran
>Assignee: Steve Loughran
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.3.9, 3.5.0, 3.4.1
>
>
> Uses of protobuf 2.5 and RpcEngine have been deprecated since 3.3.0 in 
> HADOOP-17046.
> While still keeping those files around (for a long time...), how about we 
> make the protobuf 2.5.0 export of hadoop-common and hadoop-hdfs *provided*, 
> rather than *compile*?
> That way, if apps want it for their own APIs, they have to explicitly ask for 
> it, but at least our own scans don't break.
> I have no idea what will happen to the rest of the stack at this point; it 
> will be "interesting" to see.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Resolved] (HADOOP-19083) provide hadoop binary tarball without aws v2 sdk

2024-10-08 Thread Steve Loughran (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-19083?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Loughran resolved HADOOP-19083.
-
Fix Version/s: 3.4.1
   Resolution: Fixed

> provide hadoop binary tarball without aws v2 sdk
> 
>
> Key: HADOOP-19083
> URL: https://issues.apache.org/jira/browse/HADOOP-19083
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: build, fs/s3
>Affects Versions: 3.4.0
>Reporter: Steve Loughran
>Assignee: Steve Loughran
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.4.1
>
>
> Have the default hadoop binary .tar.gz exclude the aws v2 sdk by default. 
> This SDK brings the total size of the distribution to about 1 GB.
> Proposed
> * add a profile to include the aws sdk in the dist module
> * document it for local building
> * for release builds, we modify our release ant builds to generate modified 
> x86 and arm64 releases without the file.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Updated] (HADOOP-19286) Support S3A cross region access when S3 region/endpoint is set

2024-10-08 Thread Steve Loughran (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-19286?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Loughran updated HADOOP-19286:

Fix Version/s: 3.4.2

> Support S3A cross region access when S3 region/endpoint is set
> --
>
> Key: HADOOP-19286
> URL: https://issues.apache.org/jira/browse/HADOOP-19286
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: fs/s3
>Affects Versions: 3.4.0
>Reporter: Syed Shameerur Rahman
>Assignee: Syed Shameerur Rahman
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.5.0, 3.4.2
>
>
> Currently, when neither the S3 region nor the endpoint is set, the default 
> region is set to us-east-2 with cross-region access enabled. But when a 
> region or endpoint is set, cross-region access is not enabled.
> The proposal here is to carve out cross-region access as a separate config 
> and enable/disable it irrespective of whether a region/endpoint is set. This 
> gives more flexibility to the user.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Updated] (HADOOP-19286) Support S3A cross region access when S3 region/endpoint is set

2024-10-08 Thread Steve Loughran (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-19286?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Loughran updated HADOOP-19286:

Description: 
Currently, when neither the S3 region nor the endpoint is set, the default 
region is set to us-east-2 with cross-region access enabled. But when a region 
or endpoint is set, cross-region access is not enabled.

A new option carves cross-region access out as a separate config, enabling or 
disabling it irrespective of whether a region/endpoint is set:

   s3a.cross.region.access.enabled

default: enabled, so cross-region access stays on irrespective of whether a 
region/endpoint is set.
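A sketch of using the switch (the description above names the key as s3a.cross.region.access.enabled; released builds carry the usual "fs." prefix, so check your version's documentation):

{code:java}
import org.apache.hadoop.conf.Configuration;

// Sketch: explicitly opt out of cross-region access even when no
// region/endpoint is set. Key name assumed from the description above.
Configuration conf = new Configuration();
conf.setBoolean("fs.s3a.cross.region.access.enabled", false);
{code}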



  was:
Currently, when neither the S3 region nor the endpoint is set, the default 
region is set to us-east-2 with cross-region access enabled. But when a region 
or endpoint is set, cross-region access is not enabled.

The proposal here is to carve out cross-region access as a separate config and 
enable/disable it irrespective of whether a region/endpoint is set. This gives 
more flexibility to the user.


> Support S3A cross region access when S3 region/endpoint is set
> --
>
> Key: HADOOP-19286
> URL: https://issues.apache.org/jira/browse/HADOOP-19286
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: fs/s3
>Affects Versions: 3.4.0
>Reporter: Syed Shameerur Rahman
>Assignee: Syed Shameerur Rahman
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.5.0, 3.4.2
>
>
> Currently, when neither the S3 region nor the endpoint is set, the default 
> region is set to us-east-2 with cross-region access enabled. But when a 
> region or endpoint is set, cross-region access is not enabled.
> A new option carves cross-region access out as a separate config, enabling 
> or disabling it irrespective of whether a region/endpoint is set:
>    s3a.cross.region.access.enabled
> default: enabled, so cross-region access stays on irrespective of whether a 
> region/endpoint is set.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-19286) Support S3A cross region access when S3 region/endpoint is set

2024-10-08 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-19286?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17887659#comment-17887659
 ] 

ASF GitHub Bot commented on HADOOP-19286:
-

steveloughran commented on PR #7093:
URL: https://github.com/apache/hadoop/pull/7093#issuecomment-2400377010

   done. I'm not going to put this into 3.4.1, as I'm only putting in critical 
fixes *or* things which need to go in there first. Then we can start to focus 
on 3.4.2.




> Support S3A cross region access when S3 region/endpoint is set
> --
>
> Key: HADOOP-19286
> URL: https://issues.apache.org/jira/browse/HADOOP-19286
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: fs/s3
>Affects Versions: 3.4.0
>Reporter: Syed Shameerur Rahman
>Assignee: Syed Shameerur Rahman
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.5.0
>
>
> Currently, when neither the S3 region nor the endpoint is set, the default 
> region is set to us-east-2 with cross-region access enabled. But when a 
> region or endpoint is set, cross-region access is not enabled.
> The proposal here is to carve out cross-region access as a separate config 
> and enable/disable it irrespective of whether a region/endpoint is set. This 
> gives more flexibility to the user.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-19286) Support S3A cross region access when S3 region/endpoint is set

2024-10-08 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-19286?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17887658#comment-17887658
 ] 

ASF GitHub Bot commented on HADOOP-19286:
-

steveloughran merged PR #7093:
URL: https://github.com/apache/hadoop/pull/7093




> Support S3A cross region access when S3 region/endpoint is set
> --
>
> Key: HADOOP-19286
> URL: https://issues.apache.org/jira/browse/HADOOP-19286
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: fs/s3
>Affects Versions: 3.4.0
>Reporter: Syed Shameerur Rahman
>Assignee: Syed Shameerur Rahman
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.5.0
>
>
> Currently, when neither the S3 region nor the endpoint is set, the default 
> region is set to us-east-2 with cross-region access enabled. But when a 
> region or endpoint is set, cross-region access is not enabled.
> The proposal here is to carve out cross-region access as a separate config 
> and enable/disable it irrespective of whether a region/endpoint is set. This 
> gives more flexibility to the user.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Updated] (HADOOP-19229) Vector IO on cloud storage: what is a good minimum seek size should be

2024-10-08 Thread Steve Loughran (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-19229?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Loughran updated HADOOP-19229:

Summary: Vector IO on cloud storage: what is a good minimum seek size 
should be  (was: Vector IO on cloud storage: experiment to see what a good 
minimum seek size should be)

> Vector IO on cloud storage: what is a good minimum seek size should be
> --
>
> Key: HADOOP-19229
> URL: https://issues.apache.org/jira/browse/HADOOP-19229
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Affects Versions: 3.4.1
>Reporter: Steve Loughran
>Priority: Major
>
> Vector IO has a max size to coalesce ranges, but it also needs a maximum gap 
> between ranges to justify the merge. Right now we could have a read where two 
> vectors of size 8 bytes can be merged with a 1 MB gap between them - and 
> that's wasteful.
> We could also consider an "efficiency" metric which looks at the ratio of 
> bytes-read to bytes-discarded. Not sure what we'd do with it, but we could 
> track it as an IOStat.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-19291) `CombinedFileRange.merge` should not convert disjoint ranges into overlapped ones

2024-10-08 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-19291?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17887613#comment-17887613
 ] 

ASF GitHub Bot commented on HADOOP-19291:
-

mukund-thakur opened a new pull request, #7101:
URL: https://github.com/apache/hadoop/pull/7101

   ChecksumFileSystem creates the chunked ranges based on the checksum chunk 
size and then calls readVectored on Raw Local, which may lead to overlapping 
ranges in some cases.
   
   
   
   ### Description of PR
   
   
   ### How was this patch tested?
   
   
   ### For code changes:
   
   - [ ] Does the title or this PR starts with the corresponding JIRA issue id 
(e.g. 'HADOOP-17799. Your PR title ...')?
   - [ ] Object storage: have the integration tests been executed and the 
endpoint declared according to the connector-specific documentation?
   - [ ] If adding new dependencies to the code, are these dependencies 
licensed in a way that is compatible for inclusion under [ASF 
2.0](http://www.apache.org/legal/resolved.html#category-a)?
   - [ ] If applicable, have you updated the `LICENSE`, `LICENSE-binary`, 
`NOTICE-binary` files?
   
   




> `CombinedFileRange.merge` should not convert disjoint ranges into overlapped 
> ones
> -
>
> Key: HADOOP-19291
> URL: https://issues.apache.org/jira/browse/HADOOP-19291
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: common
>Affects Versions: 3.3.9, 3.5.0, 3.4.1
>Reporter: Dongjoon Hyun
>Priority: Major
> Attachments: Screenshot 2024-09-28 at 22.02.01.png
>
>
> Currently, Hadoop has a bug that converts disjoint ranges into overlapped ones 
> and eventually fails by itself.
> {code:java}
> +  public void testMergeSortedRanges() {
> +    List<FileRange> input = asList(
> +        createFileRange(13816220, 24, null),
> +        createFileRange(13816244, 7423960, null)
> +    );
> +    assertIsNotOrderedDisjoint(input, 100, 800);
> +    final List<CombinedFileRange> outputList = mergeSortedRanges(
> +        sortRangeList(input), 100, 1001, 2500);
> +
> +    assertRangeListSize(outputList, 1);
> +    assertFileRange(outputList.get(0), 13816200, 7424100);
> +  }
> {code}
>  !Screenshot 2024-09-28 at 22.02.01.png! 



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Updated] (HADOOP-19291) `CombinedFileRange.merge` should not convert disjoint ranges into overlapped ones

2024-10-08 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-19291?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HADOOP-19291:

Labels: pull-request-available  (was: )

> `CombinedFileRange.merge` should not convert disjoint ranges into overlapped 
> ones
> -
>
> Key: HADOOP-19291
> URL: https://issues.apache.org/jira/browse/HADOOP-19291
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: common
>Affects Versions: 3.3.9, 3.5.0, 3.4.1
>Reporter: Dongjoon Hyun
>Priority: Major
>  Labels: pull-request-available
> Attachments: Screenshot 2024-09-28 at 22.02.01.png
>
>
> Currently, Hadoop has a bug that converts disjoint ranges into overlapped ones 
> and eventually fails by itself.
> {code:java}
> +  public void testMergeSortedRanges() {
> +    List<FileRange> input = asList(
> +        createFileRange(13816220, 24, null),
> +        createFileRange(13816244, 7423960, null)
> +    );
> +    assertIsNotOrderedDisjoint(input, 100, 800);
> +    final List<CombinedFileRange> outputList = mergeSortedRanges(
> +        sortRangeList(input), 100, 1001, 2500);
> +
> +    assertRangeListSize(outputList, 1);
> +    assertFileRange(outputList.get(0), 13816200, 7424100);
> +  }
> {code}
>  !Screenshot 2024-09-28 at 22.02.01.png! 



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-19291) `CombinedFileRange.merge` should not convert disjoint ranges into overlapped ones

2024-10-08 Thread Mukund Thakur (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-19291?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17887611#comment-17887611
 ] 

Mukund Thakur commented on HADOOP-19291:


ChecksumFileSystem creates the chunked ranges based on the checksum chunk size 
and then calls readVectored on Raw Local, which in some cases may lead to 
overlapping ranges even if the initial ranges are disjoint.
For more details see comments in [https://github.com/apache/hadoop/pull/7079] 
 

> `CombinedFileRange.merge` should not convert disjoint ranges into overlapped 
> ones
> -
>
> Key: HADOOP-19291
> URL: https://issues.apache.org/jira/browse/HADOOP-19291
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: common
>Affects Versions: 3.3.9, 3.5.0, 3.4.1
>Reporter: Dongjoon Hyun
>Priority: Major
> Attachments: Screenshot 2024-09-28 at 22.02.01.png
>
>
> Currently, Hadoop has a bug that converts disjoint ranges into overlapped ones 
> and eventually fails by itself.
> {code:java}
> +  public void testMergeSortedRanges() {
> +    List<FileRange> input = asList(
> +        createFileRange(13816220, 24, null),
> +        createFileRange(13816244, 7423960, null)
> +    );
> +    assertIsNotOrderedDisjoint(input, 100, 800);
> +    final List<CombinedFileRange> outputList = mergeSortedRanges(
> +        sortRangeList(input), 100, 1001, 2500);
> +
> +    assertRangeListSize(outputList, 1);
> +    assertFileRange(outputList.get(0), 13816200, 7424100);
> +  }
> {code}
>  !Screenshot 2024-09-28 at 22.02.01.png! 



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Comment Edited] (HADOOP-19291) `CombinedFileRange.merge` should not convert disjoint ranges into overlapped ones

2024-10-08 Thread Mukund Thakur (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-19291?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17887611#comment-17887611
 ] 

Mukund Thakur edited comment on HADOOP-19291 at 10/8/24 2:37 PM:
-

ChecksumFileSystem creates the chunked ranges based on the checksum chunk size 
and then calls readVectored on Raw Local, which in some cases may lead to 
overlapping ranges even if the initial ranges are disjoint.

Example:

Range1: 30918-14251143
Range2: 14251143-2958570

If the checksum size is 100, the checksum ranges become

30900-14251200
14251100-2958600

which overlap.

For more details see comments in [https://github.com/apache/hadoop/pull/7079] 
 
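To make the rounding concrete, here is a minimal standalone sketch (not the 
actual ChecksumFileSystem code; the helper names are invented) showing how 
widening each range outward to checksum-chunk boundaries makes two ranges that 
merely touch at offset 14251143 overlap:

{code:java}
public class ChunkOverlapDemo {
  static long roundDown(long off, long chunk) {
    return (off / chunk) * chunk;
  }

  static long roundUp(long off, long chunk) {
    return ((off + chunk - 1) / chunk) * chunk;
  }

  public static void main(String[] args) {
    long chunk = 100;
    // range 1 ends where range 2 starts: disjoint before rounding
    long end1 = roundUp(14251143, chunk);     // 14251200
    long start2 = roundDown(14251143, chunk); // 14251100
    System.out.println("overlap after rounding: " + (start2 < end1)); // true
  }
}
{code}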


was (Author: mthakur):
ChecksumFileSystem creates the chunked ranges based on the checksum chunk size 
and then calls
readVectored on Raw Local which may lead to overlapping ranges even if initial 
ranges are disjoint in some cases. 
For more details see comments in [https://github.com/apache/hadoop/pull/7079] 
 

> `CombinedFileRange.merge` should not convert disjoint ranges into overlapped 
> ones
> -
>
> Key: HADOOP-19291
> URL: https://issues.apache.org/jira/browse/HADOOP-19291
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: common
>Affects Versions: 3.3.9, 3.5.0, 3.4.1
>Reporter: Dongjoon Hyun
>Priority: Major
> Attachments: Screenshot 2024-09-28 at 22.02.01.png
>
>
> Currently, Hadoop has a bug that converts disjoint ranges into overlapping 
> ones and eventually fails by itself.
> {code:java}
> +  public void testMergeSortedRanges() {
> +List<FileRange> input = asList(
> +createFileRange(13816220, 24, null),
> +createFileRange(13816244, 7423960, null)
> +);
> +assertIsNotOrderedDisjoint(input, 100, 800);
> +final List<CombinedFileRange> outputList = mergeSortedRanges(
> +sortRangeList(input), 100, 1001, 2500);
> +
> +assertRangeListSize(outputList, 1);
> +assertFileRange(outputList.get(0), 13816200, 7424100);
> +  }
> {code}
>  !Screenshot 2024-09-28 at 22.02.01.png! 



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-19229) Vector IO on cloud storage: experiment to see what a good minimum seek size should be

2024-10-08 Thread Steve Loughran (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-19229?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17887574#comment-17887574
 ] 

Steve Loughran commented on HADOOP-19229:
-

This is exactly what {{fs.s3a.vectored.read.min.seek.size}} does. We set it to 
4K; maybe we should review it. The Facebook Velox paper says that 20kB is 
better for cloud storage.
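For anyone who wants to experiment, the option can be tuned in core-site.xml; 
the 20K value below is only the Velox-paper suggestion, not a recommended 
default:

{code:xml}
<!-- raise the vector IO minimum seek size from the 4K default -->
<property>
  <name>fs.s3a.vectored.read.min.seek.size</name>
  <value>20K</value>
</property>
{code}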

> Vector IO on cloud storage: experiment to see what a good minimum seek size 
> should be
> -
>
> Key: HADOOP-19229
> URL: https://issues.apache.org/jira/browse/HADOOP-19229
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Affects Versions: 3.4.1
>Reporter: Steve Loughran
>Priority: Major
>
> Vector IO has a max size to coalesce ranges, but it also needs a maximum gap 
> between ranges to justify the merge. Right now we could have a read where two 
> vectors of size 8 bytes can be merged with a 1 MB gap between them, and 
> that's wasteful. 
> We could also consider an "efficiency" metric which looks at the ratio of 
> bytes-read to bytes-discarded. Not sure what we'd do with it, but we could 
> track it as an IOStat



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Updated] (HADOOP-19229) Vector IO on cloud storage: experiment to see what a good minimum seek size should be

2024-10-08 Thread Steve Loughran (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-19229?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Loughran updated HADOOP-19229:

Summary: Vector IO on cloud storage: experiment to see what a good minimum 
seek size should be  (was: Vector IO: have a max distance between ranges to 
read)

> Vector IO on cloud storage: experiment to see what a good minimum seek size 
> should be
> -
>
> Key: HADOOP-19229
> URL: https://issues.apache.org/jira/browse/HADOOP-19229
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Affects Versions: 3.4.1
>Reporter: Steve Loughran
>Priority: Major
>
> Vector IO has a max size to coalesce ranges, but it also needs a maximum gap 
> between ranges to justify the merge. Right now we could have a read where two 
> vectors of size 8 bytes can be merged with a 1 MB gap between them, and 
> that's wasteful. 
> We could also consider an "efficiency" metric which looks at the ratio of 
> bytes-read to bytes-discarded. Not sure what we'd do with it, but we could 
> track it as an IOStat



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Reopened] (HADOOP-19229) Vector IO on cloud storage: experiment to see what a good minimum seek size should be

2024-10-08 Thread Steve Loughran (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-19229?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Loughran reopened HADOOP-19229:
-

> Vector IO on cloud storage: experiment to see what a good minimum seek size 
> should be
> -
>
> Key: HADOOP-19229
> URL: https://issues.apache.org/jira/browse/HADOOP-19229
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Affects Versions: 3.4.1
>Reporter: Steve Loughran
>Priority: Major
>
> Vector IO has a max size to coalesce ranges, but it also needs a maximum gap 
> between ranges to justify the merge. Right now we could have a read where two 
> vectors of size 8 bytes can be merged with a 1 MB gap between them, and 
> that's wasteful. 
> We could also consider an "efficiency" metric which looks at the ratio of 
> bytes-read to bytes-discarded. Not sure what we'd do with it, but we could 
> track it as an IOStat



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Resolved] (HADOOP-19229) Vector IO: have a max distance between ranges to read

2024-10-08 Thread Steve Loughran (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-19229?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Loughran resolved HADOOP-19229.
-
Release Note: This is exactly what {{fs.s3a.vectored.read.min.seek.size}} 
does. We set it to 4K; maybe we should review it. The Facebook Velox paper says 
that 20kB is better for cloud storage.
  Resolution: Fixed

> Vector IO: have a max distance between ranges to read
> -
>
> Key: HADOOP-19229
> URL: https://issues.apache.org/jira/browse/HADOOP-19229
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Affects Versions: 3.4.1
>Reporter: Steve Loughran
>Priority: Major
>
> Vector IO has a max size to coalesce ranges, but it also needs a maximum gap 
> between ranges to justify the merge. Right now we could have a read where two 
> vectors of size 8 bytes can be merged with a 1 MB gap between them, and 
> that's wasteful. 
> We could also consider an "efficiency" metric which looks at the ratio of 
> bytes-read to bytes-discarded. Not sure what we'd do with it, but we could 
> track it as an IOStat



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Created] (HADOOP-19303) VectorIO API to take a ByteBufferPool rather than just an allocator method

2024-10-08 Thread Steve Loughran (Jira)
Steve Loughran created HADOOP-19303:
---

 Summary: VectorIO API to take a ByteBufferPool rather than just an 
allocator method
 Key: HADOOP-19303
 URL: https://issues.apache.org/jira/browse/HADOOP-19303
 Project: Hadoop Common
  Issue Type: Sub-task
  Components: fs, fs/s3
Affects Versions: 3.4.1
Reporter: Steve Loughran


Extend the vector IO API with a method that takes a ByteBufferPool 
implementation rather than just an allocator. This allows buffers to be 
returned to the pool when problems occur, before throwing an exception.

The Parquet API is already designed for this.

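A rough sketch of what the new overload might look like (the names and the 
adapter are guesses at the eventual design, not the committed API):

{code:java}
// hypothetical new overload on PositionedReadable: take the pool itself so
// buffers can be handed back via pool.putBuffer() when a read fails
default void readVectored(List<? extends FileRange> ranges,
    ByteBufferPool pool) throws IOException {
  // delegate to the existing allocator-only overload; direct=false is a guess
  readVectored(ranges, len -> pool.getBuffer(false, len));
}
{code}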


--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Updated] (HADOOP-19105) S3A: Recover from Vector IO read failures

2024-10-08 Thread Steve Loughran (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-19105?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Loughran updated HADOOP-19105:

Description: 
s3a vector IO doesn't try to recover from read failures the way read() does.

Need to
* abort HTTP stream if considered needed
* retry active read which failed
* but not those which had succeeded

On a full failure we need to do something about any allocated buffer, which 
means we really need the buffer pool {{ByteBufferPool}} to return or also 
provide a "release" (Bytebuffer -> void) call which does the return.  we would 
need to
* add this as a new api with the implementations in s3a, local, rawlocal
* classic single allocator method remaps to the new one with (() -> null) as 
the response
Environment: 


This keeps the public API stable



  was:
s3a vector IO doesn't try to recover from read failures the way read() does.

Need to
* abort HTTP stream if considered needed
* retry active read which failed
* but not those which had succeeded

On a full failure we need to do something about any allocated buffer, which 
means we really need the buffer pool {{ByteBufferPool}} to return or also 
provide a "release" (Bytebuffer -> void) call which does the return.  we would 
need to
* add this as a new api with the implementations in s3a, local, rawlocal
* classic single allocator method remaps to the new one with (() -> null) as 
the response

This keeps the public API stable




> S3A: Recover from Vector IO read failures
> -
>
> Key: HADOOP-19105
> URL: https://issues.apache.org/jira/browse/HADOOP-19105
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Affects Versions: 3.4.0, 3.3.6
> Environment: This keeps the public API stable
>Reporter: Steve Loughran
>Priority: Major
>
> s3a vector IO doesn't try to recover from read failures the way read() does.
> Need to
> * abort HTTP stream if considered needed
> * retry active read which failed
> * but not those which had succeeded
> On a full failure we need to do something about any allocated buffer, which 
> means we really need the buffer pool {{ByteBufferPool}} to return or also 
> provide a "release" (Bytebuffer -> void) call which does the return.  we 
> would need to
> * add this as a new api with the implementations in s3a, local, rawlocal
> * classic single allocator method remaps to the new one with (() -> null) as 
> the response
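A tiny sketch of the remapping idea (illustrative only; the release hook is a 
no-op because a bare allocator has no pool to hand buffers back to):

{code:java}
// classic allocator, no pool behind it
IntFunction<ByteBuffer> allocate = ByteBuffer::allocate;
// hypothetical release hook for the new API: nothing to return, do nothing
Consumer<ByteBuffer> release = buf -> { };
{code}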



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Updated] (HADOOP-19229) Vector IO: have a max distance between ranges to read

2024-10-08 Thread Steve Loughran (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-19229?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Loughran updated HADOOP-19229:

Description: 
Vector IO has a max size to coalesce ranges, but it also needs a maximum gap 
between ranges to justify the merge. Right now we could have a read where two 
vectors of size 8 bytes can be merged with a 1 MB gap between them, and that's 
wasteful. 

We could also consider an "efficiency" metric which looks at the ratio of 
bytes-read to bytes-discarded. Not sure what we'd do with it, but we could 
track it as an IOStat

  was:vector iO has a max size to coalesce ranges, but it also needs a maximum 
gap between ranges to justify the merge.


> Vector IO: have a max distance between ranges to read
> -
>
> Key: HADOOP-19229
> URL: https://issues.apache.org/jira/browse/HADOOP-19229
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Affects Versions: 3.4.1
>Reporter: Steve Loughran
>Priority: Major
>
> Vector IO has a max size to coalesce ranges, but it also needs a maximum gap 
> between ranges to justify the merge. Right now we could have a read where two 
> vectors of size 8 bytes can be merged with a 1 MB gap between them, and 
> that's wasteful. 
> We could also consider an "efficiency" metric which looks at the ratio of 
> bytes-read to bytes-discarded. Not sure what we'd do with it, but we could 
> track it as an IOStat



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Updated] (HADOOP-19229) Vector IO: have a max distance between ranges to read

2024-10-08 Thread Steve Loughran (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-19229?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Loughran updated HADOOP-19229:

Summary: Vector IO: have a max distance between ranges to read  (was: 
Vector IO: have a max distance between ranges to range)

> Vector IO: have a max distance between ranges to read
> -
>
> Key: HADOOP-19229
> URL: https://issues.apache.org/jira/browse/HADOOP-19229
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Affects Versions: 3.4.1
>Reporter: Steve Loughran
>Priority: Major
>
> Vector IO has a max size to coalesce ranges, but it also needs a maximum gap 
> between ranges to justify the merge.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Resolved] (HADOOP-19299) ConcurrentModificationException in HttpReferrerAuditHeader

2024-10-07 Thread Steve Loughran (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-19299?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Loughran resolved HADOOP-19299.
-
Fix Version/s: 3.4.1
 Assignee: Steve Loughran
   Resolution: Fixed

> ConcurrentModificationException in HttpReferrerAuditHeader
> --
>
> Key: HADOOP-19299
> URL: https://issues.apache.org/jira/browse/HADOOP-19299
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: fs/s3
>Affects Versions: 3.4.1
>Reporter: Steve Loughran
>Assignee: Steve Loughran
>Priority: Major
> Fix For: 3.4.1
>
>
> Surfaced during a test run doing vector IO, where multiple parallel GETs were 
> being issued within the same audit span, just when the header is built by 
> enumerating the attributes.
> {code}
>   queries = attributes.entrySet().stream()
>   .filter(e -> !filter.contains(e.getKey()))
>   .map(e -> e.getKey() + "=" + e.getValue())
>   .collect(Collectors.joining("&"));
> {code}
> Hypothesis: multiple GET requests are conflicting in updating/reading the 
> header.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Updated] (HADOOP-19299) ConcurrentModificationException in HttpReferrerAuditHeader

2024-10-07 Thread Steve Loughran (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-19299?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Loughran updated HADOOP-19299:

Fix Version/s: 3.5.0

> ConcurrentModificationException in HttpReferrerAuditHeader
> --
>
> Key: HADOOP-19299
> URL: https://issues.apache.org/jira/browse/HADOOP-19299
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: fs/s3
>Affects Versions: 3.4.1
>Reporter: Steve Loughran
>Assignee: Steve Loughran
>Priority: Major
> Fix For: 3.5.0, 3.4.1
>
>
> Surfaced during a test run doing vector IO, where multiple parallel GETs were 
> being issued within the same audit span, just when the header is built by 
> enumerating the attributes.
> {code}
>   queries = attributes.entrySet().stream()
>   .filter(e -> !filter.contains(e.getKey()))
>   .map(e -> e.getKey() + "=" + e.getValue())
>   .collect(Collectors.joining("&"));
> {code}
> Hypothesis: multiple GET requests are conflicting in updating/reading the 
> header.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Updated] (HADOOP-18308) Update to Apache LDAP API 2.0.x

2024-10-07 Thread Steve Loughran (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-18308?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Loughran updated HADOOP-18308:

Fix Version/s: 3.5.0
   3.4.0
   Resolution: Fixed
   Status: Resolved  (was: Patch Available)

> Update to Apache LDAP API 2.0.x
> ---
>
> Key: HADOOP-18308
> URL: https://issues.apache.org/jira/browse/HADOOP-18308
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: build
>Affects Versions: 3.3.3
>Reporter: Colm O hEigeartaigh
>Assignee: Colm O hEigeartaigh
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.5.0, 3.4.0
>
>  Time Spent: 1h
>  Remaining Estimate: 0h
>
> Update from Apache LDAP API 1.x to 2.0.x.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-18308) Update to Apache LDAP API 2.0.x

2024-10-07 Thread Steve Loughran (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-18308?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17887350#comment-17887350
 ] 

Steve Loughran commented on HADOOP-18308:
-

Well, I don't know why I got mentioned here, but so it has been. Declaring as 
done for 3.4.0.

> Update to Apache LDAP API 2.0.x
> ---
>
> Key: HADOOP-18308
> URL: https://issues.apache.org/jira/browse/HADOOP-18308
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: build
>Affects Versions: 3.3.3
>Reporter: Colm O hEigeartaigh
>Assignee: Colm O hEigeartaigh
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 1h
>  Remaining Estimate: 0h
>
> Update from Apache LDAP API 1.x to 2.0.x.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-19299) ConcurrentModificationException in HttpReferrerAuditHeader

2024-10-07 Thread Steve Loughran (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-19299?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17887330#comment-17887330
 ] 

Steve Loughran commented on HADOOP-19299:
-

No need, the PR is up.

> ConcurrentModificationException in HttpReferrerAuditHeader
> --
>
> Key: HADOOP-19299
> URL: https://issues.apache.org/jira/browse/HADOOP-19299
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: fs/s3
>Affects Versions: 3.4.1
>Reporter: Steve Loughran
>Priority: Major
>
> Surfaced during a test run doing vector IO, where multiple parallel GETs were 
> being issued within the same audit span, just when the header is built by 
> enumerating the attributes.
> {code}
>   queries = attributes.entrySet().stream()
>   .filter(e -> !filter.contains(e.getKey()))
>   .map(e -> e.getKey() + "=" + e.getValue())
>   .collect(Collectors.joining("&"));
> {code}
> Hypothesis: multiple GET requests are conflicting in updating/reading the 
> header.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Updated] (HADOOP-19302) Update rat version in the docker build.sh script

2024-10-04 Thread Wei-Chiu Chuang (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-19302?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wei-Chiu Chuang updated HADOOP-19302:
-
Description: 
The docker build.sh script uses apache rat 0.15 which is removed from Apache 
CDN.
https://github.com/apache/hadoop/blob/docker-hadoop-3.4/build.sh#L20

The build in the DockerHub doesn't fail, probably because there's cache. But it 
fails for me locally.

The latest is 0.16.1. Let's update.

  was:
The docker build.sh script uses apache rat 0.15 which is removed from Apache 
CDN.
https://github.com/apache/hadoop/blob/docker-hadoop-3.4/build.sh#L20

The build in the DockerHub doesn't fail, probably because there's cache. But I 
don't download it locally.

The latest is 0.16.1. Let's update.


> Update rat version in the docker build.sh script
> 
>
> Key: HADOOP-19302
>     URL: https://issues.apache.org/jira/browse/HADOOP-19302
> Project: Hadoop Common
>  Issue Type: Bug
>Affects Versions: 3.3.7, 3.4.1
>Reporter: Wei-Chiu Chuang
>Priority: Major
>
> The docker build.sh script uses apache rat 0.15 which is removed from Apache 
> CDN.
> https://github.com/apache/hadoop/blob/docker-hadoop-3.4/build.sh#L20
> The build in the DockerHub doesn't fail, probably because there's cache. But 
> it fails for me locally.
> The latest is 0.16.1. Let's update.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Comment Edited] (HADOOP-19302) Update rat version in the docker build.sh script

2024-10-04 Thread PJ Fanning (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-19302?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17886996#comment-17886996
 ] 

PJ Fanning edited comment on HADOOP-19302 at 10/4/24 6:09 PM:
--

Please use https://archive.apache.org/dist/creadur/ instead of 
https://dlcdn.apache.org/creadur

ASF Infra encourages projects to remove old releases from dlcdn.apache.org, but 
all releases get automatically archived. Using the archive copy means that you 
don't have to worry about it being removed.

I would suggest a better solution is to use the Maven Plugin which relies on 
Maven Central.

https://creadur.apache.org/rat/apache-rat-plugin/
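For reference, a minimal plugin declaration along those lines, with the version 
matching the 0.16.1 mentioned above:

{code:xml}
<plugin>
  <groupId>org.apache.rat</groupId>
  <artifactId>apache-rat-plugin</artifactId>
  <version>0.16.1</version>
</plugin>
{code}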






was (Author: fanningpj):
Please use https://archive.apache.org/dist/creadur/ instead of 
https://dlcdn.apache.org/creadur

ASF Infra encourage projects to remove old releases from dlcdn.apache.org but 
all releases get automatically archived.

I would suggest a better solution is to use the Maven Plugin which relies on 
Maven Central.

https://creadur.apache.org/rat/apache-rat-plugin/





> Update rat version in the docker build.sh script
> 
>
> Key: HADOOP-19302
> URL: https://issues.apache.org/jira/browse/HADOOP-19302
> Project: Hadoop Common
>  Issue Type: Bug
>Affects Versions: 3.3.7, 3.4.1
>Reporter: Wei-Chiu Chuang
>Priority: Major
>
> The docker build.sh script uses apache rat 0.15 which is removed from Apache 
> CDN.
> https://github.com/apache/hadoop/blob/docker-hadoop-3.4/build.sh#L20
> The build in the DockerHub doesn't fail, probably because there's cache. But 
> it fails for me locally.
> The latest is 0.16.1. Let's update.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-19302) Update rat version in the docker build.sh script

2024-10-04 Thread PJ Fanning (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-19302?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17886996#comment-17886996
 ] 

PJ Fanning commented on HADOOP-19302:
-

Please use https://archive.apache.org/dist/creadur/ instead of 
https://dlcdn.apache.org/creadur

ASF Infra encourages projects to remove old releases from dlcdn.apache.org, but 
all releases get automatically archived.

I would suggest a better solution is to use the Maven Plugin which relies on 
Maven Central.

https://creadur.apache.org/rat/apache-rat-plugin/





> Update rat version in the docker build.sh script
> 
>
> Key: HADOOP-19302
> URL: https://issues.apache.org/jira/browse/HADOOP-19302
> Project: Hadoop Common
>  Issue Type: Bug
>Affects Versions: 3.3.7, 3.4.1
>Reporter: Wei-Chiu Chuang
>Priority: Major
>
> The docker build.sh script uses apache rat 0.15 which is removed from Apache 
> CDN.
> https://github.com/apache/hadoop/blob/docker-hadoop-3.4/build.sh#L20
> The build in the DockerHub doesn't fail, probably because there's cache. But 
> it fails for me locally.
> The latest is 0.16.1. Let's update.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Created] (HADOOP-19302) Update rat version in the docker build.sh script

2024-10-04 Thread Wei-Chiu Chuang (Jira)
Wei-Chiu Chuang created HADOOP-19302:


 Summary: Update rat version in the docker build.sh script
 Key: HADOOP-19302
 URL: https://issues.apache.org/jira/browse/HADOOP-19302
 Project: Hadoop Common
  Issue Type: Bug
Affects Versions: 3.3.7, 3.4.1
Reporter: Wei-Chiu Chuang


The docker build.sh script uses apache rat 0.15 which is removed from Apache 
CDN.
https://github.com/apache/hadoop/blob/docker-hadoop-3.4/build.sh#L20

The build in the DockerHub doesn't fail, probably because there's cache. But it 
fails for me locally.

The latest is 0.16.1. Let's update.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-19132) Upgrade hadoop3 docker scripts to 3.4.x

2024-10-04 Thread Wei-Chiu Chuang (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-19132?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17886985#comment-17886985
 ] 

Wei-Chiu Chuang commented on HADOOP-19132:
--

Just pushed the 3.4.0 tag.

Thanks [~adoroszlai] for the tips. I am not afraid to admit I'm a newbie to 
Docker :)

> Upgrade hadoop3 docker scripts to 3.4.x
> ---
>
> Key: HADOOP-19132
> URL: https://issues.apache.org/jira/browse/HADOOP-19132
> Project: Hadoop Common
>  Issue Type: Task
>Reporter: Attila Doroszlai
>Assignee: Attila Doroszlai
>Priority: Major
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Resolved] (HADOOP-19286) Support S3A cross region access when S3 region/endpoint is set

2024-10-04 Thread Steve Loughran (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-19286?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Loughran resolved HADOOP-19286.
-
Resolution: Fixed

> Support S3A cross region access when S3 region/endpoint is set
> --
>
> Key: HADOOP-19286
> URL: https://issues.apache.org/jira/browse/HADOOP-19286
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: fs/s3
>Affects Versions: 3.4.0
>Reporter: Syed Shameerur Rahman
>Assignee: Syed Shameerur Rahman
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.5.0
>
>
> Currently, when neither the S3 region nor the endpoint is set, the default 
> region is set to us-east-2 with cross region access enabled. But when a 
> region or endpoint is set, cross region access is not enabled.
> The proposal here is to carve out cross region access as a separate config 
> and enable/disable it irrespective of whether a region/endpoint is set. This 
> gives more flexibility to the user.
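Something like the following core-site.xml entry is what the proposal implies; 
the property name here is a guess at the eventual config key, not confirmed by 
this issue:

{code:xml}
<property>
  <name>fs.s3a.cross.region.access.enabled</name>
  <value>true</value>
</property>
{code}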



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-19254) Implement bulk delete command as hadoop fs command operation

2024-10-04 Thread Mukund Thakur (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-19254?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17886846#comment-17886846
 ] 

Mukund Thakur commented on HADOOP-19254:


+1 on the bulk option in rm! Sounds good to me [~harshit.gupta]  

> Implement bulk delete command as hadoop fs command operation 
> -
>
> Key: HADOOP-19254
> URL: https://issues.apache.org/jira/browse/HADOOP-19254
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: fs
>Affects Versions: 3.4.1
>Reporter: Mukund Thakur
>Assignee: Harshit Gupta
>Priority: Major
>
> {code}
> hadoop fs -bulkdelete   
> {code}
> Key uses
> * QE: Testing from python and other scripting languages
> * cluster maintenance: actual bulk deletion operations from the store
> One thought there: we MUST qualify paths with / elements: if a passed-in path 
> ends in /, it means "delete a marker", not "delete a dir". And if it doesn't 
> have one, then it's an object. This makes it possible to use the command to 
> delete surplus markers, or to handle cases where there is a file above 
> another file...cloudstore listobjects finds this.
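A sketch of the trailing-slash semantics described above (the command and its 
arguments are still being designed, so the exact syntax and paths are 
illustrative only):

{code}
# no trailing slash: delete an object
hadoop fs -bulkdelete s3a://bucket/base s3a://bucket/base/data.parquet
# trailing slash: delete the directory *marker*, not a directory tree
hadoop fs -bulkdelete s3a://bucket/base s3a://bucket/base/dir/
{code}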



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-19132) Upgrade hadoop3 docker scripts to 3.4.x

2024-10-03 Thread Attila Doroszlai (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-19132?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17886840#comment-17886840
 ] 

Attila Doroszlai commented on HADOOP-19132:
---

Thanks [~weichiu] for publishing the image for Hadoop 3.4.0.  I noticed two 
issues:

* tag 
[3.4.1|https://hub.docker.com/layers/apache/hadoop/3.4.1/images/sha256-ed8b2a9a6c78f5b5d9e96500890e3b1a9635452a35ccbc021a5676befd9b23de?context=explore]
 was used accidentally, should be 3.4.0
* [tags|https://hub.docker.com/r/apache/hadoop/tags] 3.4 and 3.4.1 point to 
different images, which requires Docker to download/store 1.1GB separately for 
each

There is no need to rebuild the image when publishing different tags.  New tag 
can be added to existing image:

{code}
# add 3.4.0 as new tag
docker tag apache/hadoop:3.4 apache/hadoop:3.4.0
docker push apache/hadoop:3.4.0
{code}

> Upgrade hadoop3 docker scripts to 3.4.x
> ---
>
> Key: HADOOP-19132
> URL: https://issues.apache.org/jira/browse/HADOOP-19132
> Project: Hadoop Common
>  Issue Type: Task
>Reporter: Attila Doroszlai
>Assignee: Attila Doroszlai
>Priority: Major
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Moved] (HADOOP-19301) MutableQuantiles.getQuantiles() should be made a static method

2024-10-03 Thread Wei-Chiu Chuang (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-19301?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wei-Chiu Chuang moved HDFS-17635 to HADOOP-19301:
-

Key: HADOOP-19301  (was: HDFS-17635)
Project: Hadoop Common  (was: Hadoop HDFS)

> MutableQuantiles.getQuantiles() should be made a static method
> --
>
> Key: HADOOP-19301
> URL: https://issues.apache.org/jira/browse/HADOOP-19301
> Project: Hadoop Common
>  Issue Type: Improvement
>Reporter: Wei-Chiu Chuang
>Priority: Trivial
>
> MutableQuantiles.getQuantiles() returns the static member variable QUANTILES, 
> so this method should be a static method too.
> https://github.com/apache/hadoop/blob/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/metrics2/lib/MutableQuantiles.java#L157
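The suggested change is a one-liner; a sketch, assuming the field really is the 
static QUANTILES the description mentions:

{code:java}
// before: an instance method returning a static field
// public synchronized Quantile[] getQuantiles() { return QUANTILES; }

// after: static, to match the field it returns
public static Quantile[] getQuantiles() {
  return QUANTILES;
}
{code}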



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-19286) Support S3A cross region access when S3 region/endpoint is set

2024-10-03 Thread Syed Shameerur Rahman (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-19286?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17886705#comment-17886705
 ] 

Syed Shameerur Rahman commented on HADOOP-19286:


[~ste...@apache.org]  - I have created a followup PR: 
https://github.com/apache/hadoop/pull/7098

> Support S3A cross region access when S3 region/endpoint is set
> --
>
> Key: HADOOP-19286
> URL: https://issues.apache.org/jira/browse/HADOOP-19286
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: fs/s3
>Affects Versions: 3.4.0
>Reporter: Syed Shameerur Rahman
>Assignee: Syed Shameerur Rahman
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.5.0
>
>
> Currently, when neither the S3 region nor the endpoint is set, the default 
> region is set to us-east-2 with cross region access enabled. But when a 
> region or endpoint is set, cross region access is not enabled.
> The proposal here is to carve out cross region access as a separate config 
> and enable/disable it irrespective of whether a region/endpoint is set. This 
> gives more flexibility to the user.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Updated] (HADOOP-19296) [JDK17] Upgrade maven-war-plugin to 3.4.0

2024-10-03 Thread Shilun Fan (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-19296?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shilun Fan updated HADOOP-19296:

  Component/s: build
   common
 Target Version/s: 3.5.0, 3.4.2
Affects Version/s: 3.4.0
   3.4.1

> [JDK17] Upgrade maven-war-plugin to 3.4.0
> -
>
> Key: HADOOP-19296
> URL: https://issues.apache.org/jira/browse/HADOOP-19296
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: build, common
>Affects Versions: 3.4.0, 3.4.1
>Reporter: Shilun Fan
>Assignee: Shilun Fan
>Priority: Major
> Fix For: 3.5.0, 3.4.2
>
>
> During the process of compiling with JDK17, we needed a higher version, which 
> has already been successfully applied in our internal builds.
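For reference, the corresponding pom change is just a version bump; a sketch:

{code:xml}
<plugin>
  <groupId>org.apache.maven.plugins</groupId>
  <artifactId>maven-war-plugin</artifactId>
  <version>3.4.0</version>
</plugin>
{code}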



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Resolved] (HADOOP-19296) [JDK17] Upgrade maven-war-plugin to 3.4.0

2024-10-03 Thread Shilun Fan (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-19296?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shilun Fan resolved HADOOP-19296.
-
Fix Version/s: 3.5.0
   3.4.2
 Hadoop Flags: Reviewed
   Resolution: Fixed

> [JDK17] Upgrade maven-war-plugin to 3.4.0
> -
>
> Key: HADOOP-19296
> URL: https://issues.apache.org/jira/browse/HADOOP-19296
> Project: Hadoop Common
>  Issue Type: Sub-task
>Reporter: Shilun Fan
>Assignee: Shilun Fan
>Priority: Major
> Fix For: 3.5.0, 3.4.2
>
>
> During the process of compiling with JDK17, we needed a higher version, which 
> has already been successfully applied in our internal builds.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Reopened] (HADOOP-19286) Support S3A cross region access when S3 region/endpoint is set

2024-10-03 Thread Steve Loughran (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-19286?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Loughran reopened HADOOP-19286:
-

The new test fails for me; I have a bucket-specific region.

{code}
[ERROR] 
testWithCrossRegionAccess(org.apache.hadoop.fs.s3a.ITestS3AEndpointRegion)  
Time elapsed: 1.199 s  <<< ERROR!
org.apache.hadoop.fs.s3a.AWSBadRequestException: getFileStatus on 
s3a://stevel-london/user/stevel: 
software.amazon.awssdk.services.s3.model.S3Exception: null (Service: S3, Status 
Code: 400, Request ID: PANJ0H4G5Z7XDN8K, Extended Request ID: 
0jVye0vK5JIuXLPN2fC3TpqYx/bi5r9Fuk7KdahorhdUJ0IGT/ca392MCjYABvq7IfLMwG/P+7Y=):null:
 null (Service: S3, Status Code: 400, Request ID: PANJ0H4G5Z7XDN8K, Extended 
Request ID: 
0jVye0vK5JIuXLPN2fC3TpqYx/bi5r9Fuk7KdahorhdUJ0IGT/ca392MCjYABvq7IfLMwG/P+7Y=)
at 
org.apache.hadoop.fs.s3a.S3AUtils.translateException(S3AUtils.java:262)
at 
org.apache.hadoop.fs.s3a.S3AUtils.translateException(S3AUtils.java:157)
at 
org.apache.hadoop.fs.s3a.S3AFileSystem.s3GetFileStatus(S3AFileSystem.java:4099)
at 
org.apache.hadoop.fs.s3a.S3AFileSystem.innerGetFileStatus(S3AFileSystem.java:4005)
at 
org.apache.hadoop.fs.s3a.S3AFileSystem.lambda$exists$33(S3AFileSystem.java:5007)
at 
org.apache.hadoop.fs.statistics.impl.IOStatisticsBinding.invokeTrackingDuration(IOStatisticsBinding.java:547)
at 
org.apache.hadoop.fs.statistics.impl.IOStatisticsBinding.lambda$trackDurationOfOperation$5(IOStatisticsBinding.java:528)
at 
org.apache.hadoop.fs.statistics.impl.IOStatisticsBinding.trackDuration(IOStatisticsBinding.java:449)
at 
org.apache.hadoop.fs.s3a.S3AFileSystem.trackDurationAndSpan(S3AFileSystem.java:2863)
at 
org.apache.hadoop.fs.s3a.S3AFileSystem.trackDurationAndSpan(S3AFileSystem.java:2882)
at 
org.apache.hadoop.fs.s3a.S3AFileSystem.exists(S3AFileSystem.java:5005)
at 
org.apache.hadoop.fs.s3a.ITestS3AEndpointRegion.testWithCrossRegionAccess(ITestS3AEndpointRegion.java:384)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at 
org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59)
at 
org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
at 
org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56)
at 
org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
at 
org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
at 
org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
at org.junit.rules.TestWatcher$1.evaluate(TestWatcher.java:61)
at 
org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:299)
at 
org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:293)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.lang.Thread.run(Thread.java:750)
Caused by: software.amazon.awssdk.services.s3.model.S3Exception: null (Service: 
S3, Status Code: 400, Request ID: PANJ0H4G5Z7XDN8K, Extended Request ID: 
0jVye0vK5JIuXLPN2fC3TpqYx/bi5r9Fuk7KdahorhdUJ0IGT/ca392MCjYABvq7IfLMwG/P+7Y=)
at 
software.amazon.awssdk.protocols.xml.internal.unmarshall.AwsXmlPredicatedResponseHandler.handleErrorResponse(AwsXmlPredicatedResponseHandler.java:156)
at 
software.amazon.awssdk.protocols.xml.internal.unmarshall.AwsXmlPredicatedResponseHandler.handleResponse(AwsXmlPredicatedResponseHandler.java:108)
at 
software.amazon.awssdk.protocols.xml.internal.unmarshall.AwsXmlPredicatedResponseHandler.handle(AwsXmlPredicatedResponseHandler.java:85)
at 
software.amazon.awssdk.protocols.xml.internal.unmarshall.AwsXmlPredicatedResponseHandler.handle(AwsXmlPredicatedResponseHandler.java:43)
at 
software.amazon.awssdk.awscore.client.handler.AwsSyncClientHandler$Crc32ValidationResponseHandler.handle(AwsSyncClientHandler.java:93)
at 
software.amazon.awssdk.core.internal.handler.BaseClientHandler.lambda$successTransformationResponseHandler$7(BaseClientHandler.java:279)
at 
software.amazon.awssdk.core.internal.http.pipeline.stages.HandleResponseStage.execute(HandleResponseStage.java:50)
at 
software.amazon.awssdk.core.internal.http.pipeline.stages.HandleResponseStage.execute(HandleResponseStage.java:38)
at 
software.amazon.awssdk.core.internal.http.pipeline.RequestPipelineBuilder$ComposingRequestPipelineStage.execute(RequestPipelineBuilder.java:206)
 

[jira] [Updated] (HADOOP-19221) S3A: Unable to recover from failure of multipart block upload attempt "Status Code: 400; Error Code: RequestTimeout"

2024-10-03 Thread Steve Loughran (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-19221?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Loughran updated HADOOP-19221:

Fix Version/s: 3.4.1
   (was: 3.4.2)

> S3A: Unable to recover from failure of multipart block upload attempt "Status 
> Code: 400; Error Code: RequestTimeout"
> 
>
> Key: HADOOP-19221
> URL: https://issues.apache.org/jira/browse/HADOOP-19221
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Affects Versions: 3.4.0
>Reporter: Steve Loughran
>Assignee: Steve Loughran
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.5.0, 3.4.1
>
>
> If a multipart PUT request fails for some reason (e.g. network error) then 
> all subsequent retry attempts fail with a 400 response and error code 
> RequestTimeout.
> {code}
> Your socket connection to the server was not read from or written to within 
> the timeout period. Idle connections will be closed. (Service: Amazon S3; 
> Status Code: 400; Error Code: RequestTimeout; Request ID:; S3 Extended 
> Request ID:
> {code}
> The list of suppressed exceptions contains the root cause (the initial 
> failure was a 500); all retries failed to upload properly from the source 
> input stream {{RequestBody.fromInputStream(fileStream, size)}}.
> Hypothesis: the mark/reset stuff doesn't work for input streams. On the v1 
> SDK we would build a multipart block upload request passing in (file, offset, 
> length); the way we are now doing this doesn't recover.
> probably fixable by providing our own {{ContentStreamProvider}} 
> implementations for
> # file + offset + length
> # bytebuffer
> # byte array
> The sdk does have explicit support for the memory ones, but they copy the 
> data blocks first. we don't want that as it would double the memory 
> requirements of active blocks.
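A sketch of the file-backed provider idea (the class and constructor are 
invented for illustration; BoundedInputStream is the commons-io one):

{code:java}
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.channels.Channels;
import java.nio.channels.FileChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

import org.apache.commons.io.input.BoundedInputStream;
import software.amazon.awssdk.http.ContentStreamProvider;

/**
 * Reopens the file slice on every call, so each SDK retry gets a fresh
 * stream instead of a half-consumed one.
 */
final class FileSliceProvider implements ContentStreamProvider {
  private final Path file;
  private final long offset;
  private final long length;

  FileSliceProvider(Path file, long offset, long length) {
    this.file = file;
    this.offset = offset;
    this.length = length;
  }

  @Override
  public InputStream newStream() {
    try {
      FileChannel ch = FileChannel.open(file, StandardOpenOption.READ);
      ch.position(offset);
      return new BoundedInputStream(Channels.newInputStream(ch), length);
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }
}
{code}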



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Updated] (HADOOP-19221) S3A: Unable to recover from failure of multipart block upload attempt "Status Code: 400; Error Code: RequestTimeout"

2024-10-03 Thread Steve Loughran (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-19221?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Loughran updated HADOOP-19221:

Release Note: S3A upload operations can now recover from failures where the 
store returns a 500 error. There is an option to control whether or not the S3A 
client itself attempts to retry on a 50x error other than 503 throttling events 
(which are independently processed as before). Option: 
fs.s3a.retry.http.5xx.errors. Default: true.

> S3A: Unable to recover from failure of multipart block upload attempt "Status 
> Code: 400; Error Code: RequestTimeout"
> 
>
> Key: HADOOP-19221
> URL: https://issues.apache.org/jira/browse/HADOOP-19221
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Affects Versions: 3.4.0
>Reporter: Steve Loughran
>Assignee: Steve Loughran
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.5.0, 3.4.2
>
>
> If a multipart PUT request fails for some reason (e.g. network error) then 
> all subsequent retry attempts fail with a 400 response and error code 
> RequestTimeout.
> {code}
> Your socket connection to the server was not read from or written to within 
> the timeout period. Idle connections will be closed. (Service: Amazon S3; 
> Status Code: 400; Error Code: RequestTimeout; Request ID:; S3 Extended 
> Request ID:
> {code}
> The list of suppressed exceptions contains the root cause (the initial 
> failure was a 500); all retries failed to upload properly from the source 
> input stream {{RequestBody.fromInputStream(fileStream, size)}}.
> Hypothesis: the mark/reset stuff doesn't work for input streams. On the v1 
> SDK we would build a multipart block upload request passing in (file, offset, 
> length); the way we are now doing this doesn't recover.
> probably fixable by providing our own {{ContentStreamProvider}} 
> implementations for
> # file + offset + length
> # bytebuffer
> # byte array
> The sdk does have explicit support for the memory ones, but they copy the 
> data blocks first. we don't want that as it would double the memory 
> requirements of active blocks.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-15760) Upgrade commons-collections to commons-collections4

2024-10-03 Thread Nihal Jain (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-15760?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17886624#comment-17886624
 ] 

Nihal Jain commented on HADOOP-15760:
-

Raised backport for branch-3.4 at https://github.com/apache/hadoop/pull/7097

> Upgrade commons-collections to commons-collections4
> ---
>
> Key: HADOOP-15760
> URL: https://issues.apache.org/jira/browse/HADOOP-15760
> Project: Hadoop Common
>  Issue Type: Improvement
>Affects Versions: 2.10.0, 3.0.3
>Reporter: David Mollitor
>Assignee: Nihal Jain
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.5.0
>
> Attachments: HADOOP-15760.1.patch
>
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> Please allow for use of Apache Commons Collections 4 library with the end 
> goal of migrating from Apache Commons Collections 3.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-19299) ConcurrentModificationException in HttpReferrerAuditHeader

2024-10-03 Thread Harshit Gupta (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-19299?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17886602#comment-17886602
 ] 

Harshit Gupta commented on HADOOP-19299:


Hi [~ste...@apache.org] , can you give some repro steps for this and maybe I 
can pick it up? 

> ConcurrentModificationException in HttpReferrerAuditHeader
> --
>
> Key: HADOOP-19299
> URL: https://issues.apache.org/jira/browse/HADOOP-19299
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: fs/s3
>Affects Versions: 3.4.1
>Reporter: Steve Loughran
>Priority: Major
>
> Surfaced during a test run doing vector IO, where multiple parallel GETs were 
> being issued within the same audit span, just when the header is built by 
> enumerating the attributes.
> {code}
>   queries = attributes.entrySet().stream()
>   .filter(e -> !filter.contains(e.getKey()))
>   .map(e -> e.getKey() + "=" + e.getValue())
>   .collect(Collectors.joining("&"));
> {code}
> Hypothesis: multiple GET requests are conflicting in updating/reading the 
> header.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-18308) Update to Apache LDAP API 2.0.x

2024-10-03 Thread Nihal Jain (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-18308?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17886585#comment-17886585
 ] 

Nihal Jain commented on HADOOP-18308:
-

Hi [~ste...@apache.org], this PR was merged quite some time back, but it seems 
it's still not marked as resolved. Just FYI!

> Update to Apache LDAP API 2.0.x
> ---
>
> Key: HADOOP-18308
> URL: https://issues.apache.org/jira/browse/HADOOP-18308
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: build
>Affects Versions: 3.3.3
>Reporter: Colm O hEigeartaigh
>Assignee: Colm O hEigeartaigh
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 1h
>  Remaining Estimate: 0h
>
> Update from Apache LDAP API 1.x to 2.0.x.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Resolved] (HADOOP-19219) Resolve Certificate error in Hadoop-auth tests.

2024-10-03 Thread Muskan Mishra (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-19219?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Muskan Mishra resolved HADOOP-19219.

Resolution: Fixed

> Resolve Certificate error in Hadoop-auth tests.
> ---
>
> Key: HADOOP-19219
> URL: https://issues.apache.org/jira/browse/HADOOP-19219
> Project: Hadoop Common
>  Issue Type: Sub-task
>Reporter: Muskan Mishra
>Assignee: Muskan Mishra
>Priority: Major
>  Labels: pull-request-available
>
> While compiling Hadoop-Trunk with JDK17, I faced the following errors in the 
> TestMultiSchemeAuthenticationHandler and 
> TestLdapAuthenticationHandler classes.
> {code:java}
> [INFO] Running 
> org.apache.hadoop.security.authentication.server.TestMultiSchemeAuthenticationHandler
> [ERROR] Tests run: 1, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 1.256 
> s <<< FAILURE! - in 
> org.apache.hadoop.security.authentication.server.TestMultiSchemeAuthenticationHandler
> [ERROR] 
> org.apache.hadoop.security.authentication.server.TestMultiSchemeAuthenticationHandler
>   Time elapsed: 1.255 s  <<< ERROR!
> java.lang.IllegalAccessError: class 
> org.apache.directory.server.core.security.CertificateUtil (in unnamed module 
> @0x32e614e9) cannot access class sun.security.x509.X500Name (in module 
> java.base) because module java.base does not export sun.security.x509 to 
> unnamed module @0x32e614e9
> at 
> org.apache.directory.server.core.security.CertificateUtil.createTempKeyStore(CertificateUtil.java:334)
> at 
> org.apache.directory.server.factory.ServerAnnotationProcessor.instantiateLdapServer(ServerAnnotationProcessor.java:158)
> at 
> org.apache.directory.server.factory.ServerAnnotationProcessor.createLdapServer(ServerAnnotationProcessor.java:318)
> at 
> org.apache.directory.server.factory.ServerAnnotationProcessor.createLdapServer(ServerAnnotationProcessor.java:351)
>  {code}
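One common workaround for this class of JDK17+ module error (whether it matches 
the fix merged here, I can't say) is to open the internal package to unnamed 
modules via the test JVM flags, e.g. in surefire:

{code:xml}
<!-- hypothetical surefire setting; the merged fix may differ -->
<argLine>--add-exports java.base/sun.security.x509=ALL-UNNAMED</argLine>
{code}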



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-19219) Resolve Certificate error in Hadoop-auth tests.

2024-10-03 Thread Muskan Mishra (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-19219?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17886584#comment-17886584
 ] 

Muskan Mishra commented on HADOOP-19219:


Closing this Jira, as PR [https://github.com/apache/hadoop/pull/7084], raised 
by [~chengpan] and associated with the existing PR 
[#6939|https://github.com/apache/hadoop/pull/6939], is now merged.

> Resolve Certificate error in Hadoop-auth tests.
> ---
>
> Key: HADOOP-19219
> URL: https://issues.apache.org/jira/browse/HADOOP-19219
> Project: Hadoop Common
>  Issue Type: Sub-task
>Reporter: Muskan Mishra
>Assignee: Muskan Mishra
>Priority: Major
>  Labels: pull-request-available
>
> While compiling Hadoop-Trunk with JDK17, I faced the following errors in the 
> TestMultiSchemeAuthenticationHandler and 
> TestLdapAuthenticationHandler classes.
> {code:java}
> [INFO] Running 
> org.apache.hadoop.security.authentication.server.TestMultiSchemeAuthenticationHandler
> [ERROR] Tests run: 1, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 1.256 
> s <<< FAILURE! - in 
> org.apache.hadoop.security.authentication.server.TestMultiSchemeAuthenticationHandler
> [ERROR] 
> org.apache.hadoop.security.authentication.server.TestMultiSchemeAuthenticationHandler
>   Time elapsed: 1.255 s  <<< ERROR!
> java.lang.IllegalAccessError: class 
> org.apache.directory.server.core.security.CertificateUtil (in unnamed module 
> @0x32e614e9) cannot access class sun.security.x509.X500Name (in module 
> java.base) because module java.base does not export sun.security.x509 to 
> unnamed module @0x32e614e9
> at 
> org.apache.directory.server.core.security.CertificateUtil.createTempKeyStore(CertificateUtil.java:334)
> at 
> org.apache.directory.server.factory.ServerAnnotationProcessor.instantiateLdapServer(ServerAnnotationProcessor.java:158)
> at 
> org.apache.directory.server.factory.ServerAnnotationProcessor.createLdapServer(ServerAnnotationProcessor.java:318)
> at 
> org.apache.directory.server.factory.ServerAnnotationProcessor.createLdapServer(ServerAnnotationProcessor.java:351)
>  {code}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-19299) ConcurrentModificationException in HttpReferrerAuditHeader

2024-10-02 Thread Steve Loughran (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-19299?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17886445#comment-17886445
 ] 

Steve Loughran commented on HADOOP-19299:
-

{code}
org.apache.hadoop.fs.contract.s3a.ITestS3AContractVectoredRead
[ERROR] testSomeRangesMergedSomeUnmerged[Buffer type : array](org.apache.hadoop.fs.contract.s3a.ITestS3AContractVectoredRead)  Time elapsed: 0.905 s  <<< ERROR!
java.util.ConcurrentModificationException
    at java.util.HashMap$EntrySpliterator.forEachRemaining(HashMap.java:1728)
    at java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:482)
    at java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:472)
    at java.util.stream.ReduceOps$ReduceOp.evaluateSequential(ReduceOps.java:708)
    at java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234)
    at java.util.stream.ReferencePipeline.collect(ReferencePipeline.java:566)
    at org.apache.hadoop.fs.store.audit.HttpReferrerAuditHeader.buildHttpReferrer(HttpReferrerAuditHeader.java:182)
    at org.apache.hadoop.fs.s3a.audit.impl.LoggingAuditor$LoggingAuditSpan.modifyHttpRequest(LoggingAuditor.java:388)
    at org.apache.hadoop.fs.s3a.audit.impl.ActiveAuditManagerS3A$WrappingAuditSpan.modifyHttpRequest(ActiveAuditManagerS3A.java:871)
    at org.apache.hadoop.fs.s3a.audit.impl.ActiveAuditManagerS3A.modifyHttpRequest(ActiveAuditManagerS3A.java:612)
    at software.amazon.awssdk.core.interceptor.ExecutionInterceptorChain.modifyHttpRequestAndHttpContent(ExecutionInterceptorChain.java:89)
    at software.amazon.awssdk.core.internal.handler.BaseClientHandler.runModifyHttpRequestAndHttpContentInterceptors(BaseClientHandler.java:157)
    at software.amazon.awssdk.core.internal.handler.BaseClientHandler.finalizeSdkHttpFullRequest(BaseClientHandler.java:83)
    at software.amazon.awssdk.core.internal.handler.BaseSyncClientHandler.doExecute(BaseSyncClientHandler.java:151)
    at software.amazon.awssdk.core.internal.handler.BaseSyncClientHandler.lambda$execute$0(BaseSyncClientHandler.java:66)
    at software.amazon.awssdk.core.internal.handler.BaseSyncClientHandler.measureApiCallSuccess(BaseSyncClientHandler.java:182)
    at software.amazon.awssdk.core.internal.handler.BaseSyncClientHandler.execute(BaseSyncClientHandler.java:60)
    at software.amazon.awssdk.core.client.handler.SdkSyncClientHandler.execute(SdkSyncClientHandler.java:52)
    at software.amazon.awssdk.awscore.client.handler.AwsSyncClientHandler.execute(AwsSyncClientHandler.java:60)
    at software.amazon.awssdk.services.s3.DefaultS3Client.getObject(DefaultS3Client.java:5174)
    at software.amazon.awssdk.services.s3.S3Client.getObject(S3Client.java:9005)
    at org.apache.hadoop.fs.s3a.S3AFileSystem$InputStreamCallbacksImpl.getObject(S3AFileSystem.java:1934)
    at org.apache.hadoop.fs.s3a.S3AInputStream.lambda$getS3Object$7(S3AInputStream.java:1223)
    at org.apache.hadoop.fs.s3a.Invoker.once(Invoker.java:122)
    at org.apache.hadoop.fs.s3a.Invoker.lambda$retry$4(Invoker.java:376)
    at org.apache.hadoop.fs.s3a.Invoker.retryUntranslated(Invoker.java:468)
    at org.apache.hadoop.fs.s3a.Invoker.retry(Invoker.java:372)
    at org.apache.hadoop.fs.s3a.Invoker.retry(Invoker.java:347)
    at org.apache.hadoop.fs.s3a.S3AInputStream.getS3Object(S3AInputStream.java:1220)
    at org.apache.hadoop.fs.s3a.S3AInputStream.getS3ObjectInputStream(S3AInputStream.java:1117)
    at org.apache.hadoop.fs.s3a.S3AInputStream.readCombinedRangeAndUpdateChildren(S3AInputStream.java:963)
    at org.apache.hadoop.fs.s3a.S3AInputStream.lambda$readVectored$5(S3AInputStream.java:945)
    at org.apache.hadoop.util.SemaphoredDelegatingExecutor$RunnableWithPermitRelease.run(SemaphoredDelegatingExecutor.java:225)
    at org.apache.hadoop.util.SemaphoredDelegatingExecutor$RunnableWithPermitRelease.run(SemaphoredDelegatingExecutor.java:225)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:750)
{code}


> ConcurrentModificationException in HttpReferrerAuditHeader
> --
>
> Key: HADOOP-19299
> URL: https://issues.apache.org/jira/browse/HADOOP-19299
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: fs/s3
>Affects Versions: 3.4.1
>Reporter: Steve Loughran
>Priority: Major
>
> Surfaced during a test run doing vector IO, where multiple parallel GETs were
> being issued within the same audit span, just as the header was being built by
> enumerating the attributes.

[jira] [Created] (HADOOP-19299) ConcurrentModificationException in HttpReferrerAuditHeader

2024-10-02 Thread Steve Loughran (Jira)
Steve Loughran created HADOOP-19299:
---

 Summary: ConcurrentModificationException in HttpReferrerAuditHeader
 Key: HADOOP-19299
 URL: https://issues.apache.org/jira/browse/HADOOP-19299
 Project: Hadoop Common
  Issue Type: Bug
  Components: fs/s3
Affects Versions: 3.4.1
Reporter: Steve Loughran


Surfaced during a test run doing vector IO, where multiple parallel GETs were
being issued within the same audit span, just as the header was being built by
enumerating the attributes.

{code}
  queries = attributes.entrySet().stream()
      .filter(e -> !filter.contains(e.getKey()))
      .map(e -> e.getKey() + "=" + e.getValue())
      .collect(Collectors.joining("&"));
{code}

Hypothesis: multiple GET requests within the same audit span are concurrently
updating and reading the attribute map while the header is being built.
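
As a sanity check on that hypothesis, a minimal standalone sketch (illustrative names only, not the Hadoop code) shows that streaming a plain HashMap while another thread mutates it raises exactly this exception, while a weakly consistent ConcurrentHashMap does not:

{code:java}
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.stream.Collectors;

public class CmeRepro {
  public static void main(String[] args) throws Exception {
    Map<String, String> attributes = new HashMap<>();
    for (int i = 0; i < 100_000; i++) {
      attributes.put("k" + i, "v" + i);
    }
    // Mutate the map while the main thread streams it, the way parallel
    // GETs would update span attributes while the referrer is built.
    Thread writer = new Thread(() -> {
      for (int i = 0; i < 100_000; i++) {
        attributes.put("extra" + i, "v");
      }
    });
    writer.start();
    try {
      // HashMap's spliterator is fail-fast, so this usually throws CME.
      attributes.entrySet().stream()
          .map(e -> e.getKey() + "=" + e.getValue())
          .collect(Collectors.joining("&"));
    } catch (java.util.ConcurrentModificationException e) {
      System.out.println("reproduced: " + e);
    }
    writer.join();
    // ConcurrentHashMap iterators are weakly consistent and never throw
    // CME; switching the attribute map to one is one possible mitigation.
    Map<String, String> safe = new ConcurrentHashMap<>(attributes);
    safe.entrySet().stream()
        .map(e -> e.getKey() + "=" + e.getValue())
        .collect(Collectors.joining("&"));
  }
}
{code}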



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Resolved] (HADOOP-19293) Avoid Subject.getSubject method on newer JVMs

2024-10-02 Thread Steve Loughran (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-19293?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Loughran resolved HADOOP-19293.
-
Resolution: Duplicate

see discussion on HADOOP-19212

> Avoid Subject.getSubject method on newer JVMs
> -
>
> Key: HADOOP-19293
> URL: https://issues.apache.org/jira/browse/HADOOP-19293
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: auth, common
>Reporter: Justin
>Assignee: Justin
>Priority: Major
>
> In Java 23, Subject.getSubject requires the system property 
> java.security.manager to be set to allow; otherwise it throws an exception. 
> More detail is available in the release notes: https://jdk.java.net/23/release-notes
> This is in support of the eventual removal of the Security Manager, at which 
> point Subject.getSubject will be removed.
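
For anyone tracking the migration, a minimal sketch of the JDK 18+ replacement APIs (illustrative only; see the discussion on HADOOP-19212 for the actual Hadoop change):

{code:java}
import javax.security.auth.Subject;

public class SubjectMigration {
  public static void main(String[] args) {
    // Old pattern, which fails on Java 23 without
    // -Djava.security.manager=allow:
    //   Subject s = Subject.getSubject(AccessController.getContext());
    // Replacements available since Java 18: Subject.current() and
    // Subject.callAs(Subject, Callable).
    Subject subject = new Subject();
    String result = Subject.callAs(subject,
        () -> "running as: " + Subject.current());
    System.out.println(result);
  }
}
{code}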



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-17895) RawLocalFileSystem cannot mkdir/chmod paths with emojis. ☹️

2024-10-01 Thread Wilson M Penha Jr. (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-17895?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17886263#comment-17886263
 ] 

Wilson M Penha Jr. commented on HADOOP-17895:
-

I am having this same issue with hadoop-3.1.1 when running 
hbase-snapshot-exporter:

2024-10-01 19:40:42,253 ERROR [main] snapshot.ExportSnapshot: Snapshot export failed
ENOENT: No such file or directory
    at org.apache.hadoop.io.nativeio.NativeIO$POSIX.chmodImpl(Native Method)
    at org.apache.hadoop.io.nativeio.NativeIO$POSIX.chmod(NativeIO.java:233)
    at org.apache.hadoop.fs.RawLocalFileSystem.setPermission(RawLocalFileSystem.java:836)
    at org.apache.hadoop.fs.ChecksumFileSystem$1.apply(ChecksumFileSystem.java:508)
    at org.apache.hadoop.fs.ChecksumFileSystem$FsOperation.run(ChecksumFileSystem.java:489)
    at org.apache.hadoop.fs.ChecksumFileSystem.setPermission(ChecksumFileSystem.java:511)
    at org.apache.hadoop.fs.FileSystem.mkdirs(FileSystem.java:676)

I am running out of options here. Did you figure this out yet?

 

> RawLocalFileSystem cannot mkdir/chmod paths with emojis. ☹️
> ---
>
> Key: HADOOP-17895
> URL: https://issues.apache.org/jira/browse/HADOOP-17895
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: common, fs
>Affects Versions: 3.3.1
>Reporter: Zamil Majdy
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 2h 20m
>  Remaining Estimate: 0h
>
> *Bug description:*
> The `fs.mkdirs` call on `RawLocalFileSystem` doesn't work in Hadoop 3 with 
> NativeIO enabled for paths containing emojis.
> The failure happens in the native `chmod` call on the new directory 
> (the `mkdir` call itself works).
> Stacktrace:
> {{ENOENT: No such file or directory
> at org.apache.hadoop.io.nativeio.NativeIO$POSIX.chmodImpl(Native Method)
> at org.apache.hadoop.io.nativeio.NativeIO$POSIX.chmod(NativeIO.java:382)
> at org.apache.hadoop.fs.RawLocalFileSystem.setPermission(RawLocalFileSystem.java:974)
> at org.apache.hadoop.fs.RawLocalFileSystem.mkOneDirWithMode(RawLocalFileSystem.java:660)
> at org.apache.hadoop.fs.RawLocalFileSystem.mkdirsWithOptionalPermission(RawLocalFileSystem.java:700)
> at org.apache.hadoop.fs.RawLocalFileSystem.mkdirs(RawLocalFileSystem.java:672)}}
>  
> *To reproduce:*
>  * Call `fs.mkdirs` on RawLocalFileSystem with NativeIO enabled, using a path that contains an emoji (a sketch follows below).
>  * Sample: [https://github.com/apache/hadoop/pull/3391]
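
A minimal repro sketch of that failure mode (assumes a Hadoop 3.x classpath with the native libraries loaded; the path name is illustrative):

{code:java}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class EmojiMkdirRepro {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    // Bypass the checksummed wrapper to hit RawLocalFileSystem directly.
    FileSystem raw = FileSystem.getLocal(conf).getRawFileSystem();
    Path dir = new Path("file:///tmp/emoji-☹️-dir");
    // Per the report: the mkdir itself succeeds, but the follow-up native
    // chmod fails with "ENOENT: No such file or directory" when NativeIO
    // is enabled and the path contains an emoji.
    raw.mkdirs(dir);
    System.out.println("created: " + raw.exists(dir));
  }
}
{code}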



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Assigned] (HADOOP-19292) BlockDecompressorStream#rawReadInt wastes about 1% of overall CPU cycles creating new EOFException

2024-10-01 Thread Wei-Chiu Chuang (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-19292?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wei-Chiu Chuang reassigned HADOOP-19292:


Assignee: Benoit Sigoure

> BlockDecompressorStream#rawReadInt wastes about 1% of overall CPU cycles 
> creating new EOFException
> --
>
> Key: HADOOP-19292
> URL: https://issues.apache.org/jira/browse/HADOOP-19292
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: compress, io
>Affects Versions: 3.3.6
>Reporter: Benoit Sigoure
>Assignee: Benoit Sigoure
>Priority: Major
> Attachments: 
> HADOOP-19292-Don-t-create-new-EOFException-in-BlockD.patch
>
>
> On our HBase clusters, while looking at CPU profiles, I noticed that about 1% 
> of overall CPU cycles are spent under BlockDecompressorStream#rawReadInt just 
> throwing EOFException. This could be easily avoided.
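
The usual shape of the fix (a sketch of the general technique, not necessarily the attached patch): pre-build the exception once and suppress stack-trace capture, since fillInStackTrace() is where the cycles go. The trade-off is that the reused exception carries no meaningful stack.

{code:java}
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;

public class RawReadIntSketch {
  // Built once with stack capture suppressed; throwing it is then cheap.
  private static final EOFException EOF =
      new EOFException("EOF while reading int") {
        @Override
        public synchronized Throwable fillInStackTrace() {
          return this; // skip the expensive stack walk
        }
      };

  static int rawReadInt(InputStream in) throws IOException {
    int b1 = in.read(), b2 = in.read(), b3 = in.read(), b4 = in.read();
    if ((b1 | b2 | b3 | b4) < 0) {
      throw EOF; // reused instance: no allocation, no stack fill
    }
    return (b1 << 24) | (b2 << 16) | (b3 << 8) | b4;
  }
}
{code}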



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Resolved] (HADOOP-19286) Support S3A cross region access when S3 region/endpoint is set

2024-10-01 Thread Steve Loughran (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-19286?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Loughran resolved HADOOP-19286.
-
Fix Version/s: 3.5.0
   Resolution: Fixed

> Support S3A cross region access when S3 region/endpoint is set
> --
>
> Key: HADOOP-19286
> URL: https://issues.apache.org/jira/browse/HADOOP-19286
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: fs/s3
>Affects Versions: 3.4.0
>Reporter: Syed Shameerur Rahman
>Assignee: Syed Shameerur Rahman
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.5.0
>
>
> Currently, when neither the S3 region nor the endpoint is set, the default 
> region is set to us-east-2 with cross-region access enabled. But when a region 
> or endpoint is set, cross-region access is not enabled.
> The proposal here is to carve out cross-region access as a separate config 
> and enable/disable it irrespective of whether a region/endpoint is set. This 
> gives the user more flexibility.
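
A hedged sketch of what the carved-out switch could look like from user code (the property name below is an assumption based on this proposal; check the released documentation for the final key):

{code:java}
import org.apache.hadoop.conf.Configuration;

public class CrossRegionConfig {
  public static void main(String[] args) {
    Configuration conf = new Configuration();
    // Pin a region or endpoint as before...
    conf.set("fs.s3a.endpoint.region", "us-west-2");
    // ...and independently opt in to cross-region access. The property
    // name here is assumed from the proposal, not a confirmed constant.
    conf.setBoolean("fs.s3a.cross.region.access.enabled", true);
  }
}
{code}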



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-19292) BlockDecompressorStream#rawReadInt wastes about 1% of overall CPU cycles creating new EOFException

2024-10-01 Thread Benoit Sigoure (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-19292?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17886255#comment-17886255
 ] 

Benoit Sigoure commented on HADOOP-19292:
-

Hi Steve, done: https://github.com/apache/hadoop/pull/7090

> BlockDecompressorStream#rawReadInt wastes about 1% of overall CPU cycles 
> creating new EOFException
> --
>
> Key: HADOOP-19292
> URL: https://issues.apache.org/jira/browse/HADOOP-19292
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: compress, io
>Affects Versions: 3.3.6
>Reporter: Benoit Sigoure
>Priority: Major
> Attachments: 
> HADOOP-19292-Don-t-create-new-EOFException-in-BlockD.patch
>
>
> On our HBase clusters, while looking at CPU profiles, I noticed that about 1% 
> of overall CPU cycles are spent under BlockDecompressorStream#rawReadInt just 
> throwing EOFException. This could be easily avoided.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Resolved] (HADOOP-19288) hadoop-client-runtime exclude dnsjava InetAddressResolverProvider

2024-10-01 Thread Steve Loughran (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-19288?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Loughran resolved HADOOP-19288.
-
Fix Version/s: 3.5.0
   3.4.1
   Resolution: Fixed

> hadoop-client-runtime exclude dnsjava InetAddressResolverProvider
> -
>
> Key: HADOOP-19288
> URL: https://issues.apache.org/jira/browse/HADOOP-19288
> Project: Hadoop Common
>  Issue Type: Bug
>Reporter: dzcxzl
>Assignee: dzcxzl
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.5.0, 3.4.1
>
>
> [https://github.com/dnsjava/dnsjava/issues/338]
>  
> {code:java}
> Exception in thread "main" java.util.ServiceConfigurationError: java.net.spi.InetAddressResolverProvider: Provider org.apache.hadoop.shaded.org.xbill.DNS.spi.DnsjavaInetAddressResolverProvider not found
>     at java.base/java.util.ServiceLoader.fail(ServiceLoader.java:593)
>     at java.base/java.util.ServiceLoader$LazyClassPathLookupIterator.nextProviderClass(ServiceLoader.java:1219)
>     at java.base/java.util.ServiceLoader$LazyClassPathLookupIterator.hasNextService(ServiceLoader.java:1228)
>     at java.base/java.util.ServiceLoader$LazyClassPathLookupIterator.hasNext(ServiceLoader.java:1273)
>     at java.base/java.util.ServiceLoader$2.hasNext(ServiceLoader.java:1309)
>     at java.base/java.util.ServiceLoader$3.hasNext(ServiceLoader.java:1393)
>     at java.base/java.util.ServiceLoader.findFirst(ServiceLoader.java:1812)
>     at java.base/java.net.InetAddress.loadResolver(InetAddress.java:508)
>     at java.base/java.net.InetAddress.resolver(InetAddress.java:488)
>     at java.base/java.net.InetAddress.getAddressesFromNameService(InetAddress.java:1826)
>     at java.base/java.net.InetAddress$NameServiceAddresses.get(InetAddress.java:1139)
>     at java.base/java.net.InetAddress.getAllByName0(InetAddress.java:1818)
>     at java.base/java.net.InetAddress.getLocalHost(InetAddress.java:1931)
>     at org.apache.logging.log4j.core.util.NetUtils.getLocalHostname(NetUtils.java:56)
>     at org.apache.logging.log4j.core.LoggerContext.lambda$setConfiguration$0(LoggerContext.java:625)
>  {code}
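
For reference, a sketch of the JDK 18+ service lookup that trips here: a jar shipping a META-INF/services/java.net.spi.InetAddressResolverProvider entry that names a class absent from the jar (for example a relocated dnsjava provider) fails in exactly this way.

{code:java}
import java.net.spi.InetAddressResolverProvider;
import java.util.ServiceLoader;

public class ResolverProbe {
  public static void main(String[] args) {
    // The JDK performs this lookup lazily on the first name resolution.
    // A dangling service file makes the iteration throw
    // ServiceConfigurationError, as in the stack trace above.
    ServiceLoader<InetAddressResolverProvider> loader =
        ServiceLoader.load(InetAddressResolverProvider.class);
    loader.findFirst().ifPresentOrElse(
        p -> System.out.println("provider: " + p.name()),
        () -> System.out.println("built-in resolver in use"));
  }
}
{code}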



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-19290) Operating on / in ChecksumFileSystem throws NPE

2024-10-01 Thread Ayush Saxena (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-19290?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17886230#comment-17886230
 ] 

Ayush Saxena commented on HADOOP-19290:
---

cherry-picked to 3.4

> Operating on / in ChecksumFileSystem throws NPE
> ---
>
> Key: HADOOP-19290
> URL: https://issues.apache.org/jira/browse/HADOOP-19290
> Project: Hadoop Common
>  Issue Type: Bug
>Reporter: Ayush Saxena
>Assignee: Ayush Saxena
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.5.0
>
>
> Operating on / in ChecksumFileSystem throws NPE
> {noformat}
> java.lang.NullPointerException
>   at org.apache.hadoop.fs.Path.<init>(Path.java:151)
>   at org.apache.hadoop.fs.Path.<init>(Path.java:130)
>   at org.apache.hadoop.fs.ChecksumFileSystem.getChecksumFile(ChecksumFileSystem.java:121)
>   at org.apache.hadoop.fs.ChecksumFileSystem$FsOperation.run(ChecksumFileSystem.java:774)
>   at org.apache.hadoop.fs.ChecksumFileSystem.setReplication(ChecksumFileSystem.java:884)
> {noformat}
> Internally I observed it for setPermission, but on my Mac LocalFs doesn't let 
> me setPermission on "/", so I reproduced it via setReplication, which goes 
> through the same code path.
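
The trigger is easy to see in isolation: getChecksumFile derives the .crc sibling from the parent path, and "/" has no parent. A minimal sketch (the .crc file name below is illustrative):

{code:java}
import org.apache.hadoop.fs.Path;

public class RootParentNpe {
  public static void main(String[] args) {
    Path root = new Path("/");
    System.out.println(root.getParent()); // null: the root has no parent
    // ChecksumFileSystem.getChecksumFile builds, roughly,
    // new Path(file.getParent(), "." + file.getName() + ".crc");
    // a null parent then feeds straight into the Path constructor:
    new Path(root.getParent(), ".x.crc"); // throws NullPointerException
  }
}
{code}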



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Updated] (HADOOP-19290) Operating on / in ChecksumFileSystem throws NPE

2024-10-01 Thread Ayush Saxena (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-19290?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ayush Saxena updated HADOOP-19290:
--
Fix Version/s: 3.4.2

> Operating on / in ChecksumFileSystem throws NPE
> ---
>
> Key: HADOOP-19290
> URL: https://issues.apache.org/jira/browse/HADOOP-19290
> Project: Hadoop Common
>  Issue Type: Bug
>Reporter: Ayush Saxena
>Assignee: Ayush Saxena
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.5.0, 3.4.2
>
>
> Operating on / in ChecksumFileSystem throws NPE
> {noformat}
> java.lang.NullPointerException
>   at org.apache.hadoop.fs.Path.<init>(Path.java:151)
>   at org.apache.hadoop.fs.Path.<init>(Path.java:130)
>   at org.apache.hadoop.fs.ChecksumFileSystem.getChecksumFile(ChecksumFileSystem.java:121)
>   at org.apache.hadoop.fs.ChecksumFileSystem$FsOperation.run(ChecksumFileSystem.java:774)
>   at org.apache.hadoop.fs.ChecksumFileSystem.setReplication(ChecksumFileSystem.java:884)
> {noformat}
> Internally I observed it for setPermission, but on my Mac LocalFs doesn't let 
> me setPermission on "/", so I reproduced it via setReplication, which goes 
> through the same code path.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-19290) Operating on / in ChecksumFileSystem throws NPE

2024-10-01 Thread Steve Loughran (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-19290?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17886227#comment-17886227
 ] 

Steve Loughran commented on HADOOP-19290:
-

Are you going to backport this to 3.4.0?

> Operating on / in ChecksumFileSystem throws NPE
> ---
>
> Key: HADOOP-19290
> URL: https://issues.apache.org/jira/browse/HADOOP-19290
> Project: Hadoop Common
>  Issue Type: Bug
>Reporter: Ayush Saxena
>Assignee: Ayush Saxena
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.5.0
>
>
> Operating on / in ChecksumFileSystem throws NPE
> {noformat}
> java.lang.NullPointerException
>   at org.apache.hadoop.fs.Path.<init>(Path.java:151)
>   at org.apache.hadoop.fs.Path.<init>(Path.java:130)
>   at org.apache.hadoop.fs.ChecksumFileSystem.getChecksumFile(ChecksumFileSystem.java:121)
>   at org.apache.hadoop.fs.ChecksumFileSystem$FsOperation.run(ChecksumFileSystem.java:774)
>   at org.apache.hadoop.fs.ChecksumFileSystem.setReplication(ChecksumFileSystem.java:884)
> {noformat}
> Internally I observed it for setPermission, but on my Mac LocalFs doesn't let 
> me setPermission on "/", so I reproduced it via setReplication, which goes 
> through the same code path.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-19292) BlockDecompressorStream#rawReadInt wastes about 1% of overall CPU cycles creating new EOFException

2024-10-01 Thread Steve Loughran (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-19292?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17886226#comment-17886226
 ] 

Steve Loughran commented on HADOOP-19292:
-

thanks. could you submit as a github PR against trunk?

> BlockDecompressorStream#rawReadInt wastes about 1% of overall CPU cycles 
> creating new EOFException
> --
>
> Key: HADOOP-19292
> URL: https://issues.apache.org/jira/browse/HADOOP-19292
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: compress, io
>Affects Versions: 3.3.6
>Reporter: Benoit Sigoure
>Priority: Major
> Attachments: 
> HADOOP-19292-Don-t-create-new-EOFException-in-BlockD.patch
>
>
> On our HBase clusters, while looking at CPU profiles, I noticed that about 1% 
> of overall CPU cycles are spent under BlockDecompressorStream#rawReadInt just 
> throwing EOFException. This could be easily avoided.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Updated] (HADOOP-19288) hadoop-client-runtime exclude dnsjava InetAddressResolverProvider

2024-10-01 Thread Steve Loughran (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-19288?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Loughran updated HADOOP-19288:

Component/s: build

> hadoop-client-runtime exclude dnsjava InetAddressResolverProvider
> -
>
> Key: HADOOP-19288
> URL: https://issues.apache.org/jira/browse/HADOOP-19288
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: build
>Affects Versions: 3.4.1
>Reporter: dzcxzl
>Assignee: dzcxzl
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.5.0, 3.4.1
>
>
> [https://github.com/dnsjava/dnsjava/issues/338]
>  
> {code:java}
> Exception in thread "main" java.util.ServiceConfigurationError: java.net.spi.InetAddressResolverProvider: Provider org.apache.hadoop.shaded.org.xbill.DNS.spi.DnsjavaInetAddressResolverProvider not found
>     at java.base/java.util.ServiceLoader.fail(ServiceLoader.java:593)
>     at java.base/java.util.ServiceLoader$LazyClassPathLookupIterator.nextProviderClass(ServiceLoader.java:1219)
>     at java.base/java.util.ServiceLoader$LazyClassPathLookupIterator.hasNextService(ServiceLoader.java:1228)
>     at java.base/java.util.ServiceLoader$LazyClassPathLookupIterator.hasNext(ServiceLoader.java:1273)
>     at java.base/java.util.ServiceLoader$2.hasNext(ServiceLoader.java:1309)
>     at java.base/java.util.ServiceLoader$3.hasNext(ServiceLoader.java:1393)
>     at java.base/java.util.ServiceLoader.findFirst(ServiceLoader.java:1812)
>     at java.base/java.net.InetAddress.loadResolver(InetAddress.java:508)
>     at java.base/java.net.InetAddress.resolver(InetAddress.java:488)
>     at java.base/java.net.InetAddress.getAddressesFromNameService(InetAddress.java:1826)
>     at java.base/java.net.InetAddress$NameServiceAddresses.get(InetAddress.java:1139)
>     at java.base/java.net.InetAddress.getAllByName0(InetAddress.java:1818)
>     at java.base/java.net.InetAddress.getLocalHost(InetAddress.java:1931)
>     at org.apache.logging.log4j.core.util.NetUtils.getLocalHostname(NetUtils.java:56)
>     at org.apache.logging.log4j.core.LoggerContext.lambda$setConfiguration$0(LoggerContext.java:625)
>  {code}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Assigned] (HADOOP-19288) hadoop-client-runtime exclude dnsjava InetAddressResolverProvider

2024-10-01 Thread Steve Loughran (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-19288?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Loughran reassigned HADOOP-19288:
---

Assignee: dzcxzl

> hadoop-client-runtime exclude dnsjava InetAddressResolverProvider
> -
>
> Key: HADOOP-19288
> URL: https://issues.apache.org/jira/browse/HADOOP-19288
> Project: Hadoop Common
>  Issue Type: Bug
>Reporter: dzcxzl
>Assignee: dzcxzl
>Priority: Major
>  Labels: pull-request-available
>
> [https://github.com/dnsjava/dnsjava/issues/338]
>  
> {code:java}
> Exception in thread "main" java.util.ServiceConfigurationError: java.net.spi.InetAddressResolverProvider: Provider org.apache.hadoop.shaded.org.xbill.DNS.spi.DnsjavaInetAddressResolverProvider not found
>     at java.base/java.util.ServiceLoader.fail(ServiceLoader.java:593)
>     at java.base/java.util.ServiceLoader$LazyClassPathLookupIterator.nextProviderClass(ServiceLoader.java:1219)
>     at java.base/java.util.ServiceLoader$LazyClassPathLookupIterator.hasNextService(ServiceLoader.java:1228)
>     at java.base/java.util.ServiceLoader$LazyClassPathLookupIterator.hasNext(ServiceLoader.java:1273)
>     at java.base/java.util.ServiceLoader$2.hasNext(ServiceLoader.java:1309)
>     at java.base/java.util.ServiceLoader$3.hasNext(ServiceLoader.java:1393)
>     at java.base/java.util.ServiceLoader.findFirst(ServiceLoader.java:1812)
>     at java.base/java.net.InetAddress.loadResolver(InetAddress.java:508)
>     at java.base/java.net.InetAddress.resolver(InetAddress.java:488)
>     at java.base/java.net.InetAddress.getAddressesFromNameService(InetAddress.java:1826)
>     at java.base/java.net.InetAddress$NameServiceAddresses.get(InetAddress.java:1139)
>     at java.base/java.net.InetAddress.getAllByName0(InetAddress.java:1818)
>     at java.base/java.net.InetAddress.getLocalHost(InetAddress.java:1931)
>     at org.apache.logging.log4j.core.util.NetUtils.getLocalHostname(NetUtils.java:56)
>     at org.apache.logging.log4j.core.LoggerContext.lambda$setConfiguration$0(LoggerContext.java:625)
>  {code}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Updated] (HADOOP-19288) hadoop-client-runtime exclude dnsjava InetAddressResolverProvider

2024-10-01 Thread Steve Loughran (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-19288?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Loughran updated HADOOP-19288:

Affects Version/s: 3.4.1

> hadoop-client-runtime exclude dnsjava InetAddressResolverProvider
> -
>
> Key: HADOOP-19288
> URL: https://issues.apache.org/jira/browse/HADOOP-19288
> Project: Hadoop Common
>  Issue Type: Bug
>Affects Versions: 3.4.1
>Reporter: dzcxzl
>Assignee: dzcxzl
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.5.0, 3.4.1
>
>
> [https://github.com/dnsjava/dnsjava/issues/338]
>  
> {code:java}
> Exception in thread "main" java.util.ServiceConfigurationError: java.net.spi.InetAddressResolverProvider: Provider org.apache.hadoop.shaded.org.xbill.DNS.spi.DnsjavaInetAddressResolverProvider not found
>     at java.base/java.util.ServiceLoader.fail(ServiceLoader.java:593)
>     at java.base/java.util.ServiceLoader$LazyClassPathLookupIterator.nextProviderClass(ServiceLoader.java:1219)
>     at java.base/java.util.ServiceLoader$LazyClassPathLookupIterator.hasNextService(ServiceLoader.java:1228)
>     at java.base/java.util.ServiceLoader$LazyClassPathLookupIterator.hasNext(ServiceLoader.java:1273)
>     at java.base/java.util.ServiceLoader$2.hasNext(ServiceLoader.java:1309)
>     at java.base/java.util.ServiceLoader$3.hasNext(ServiceLoader.java:1393)
>     at java.base/java.util.ServiceLoader.findFirst(ServiceLoader.java:1812)
>     at java.base/java.net.InetAddress.loadResolver(InetAddress.java:508)
>     at java.base/java.net.InetAddress.resolver(InetAddress.java:488)
>     at java.base/java.net.InetAddress.getAddressesFromNameService(InetAddress.java:1826)
>     at java.base/java.net.InetAddress$NameServiceAddresses.get(InetAddress.java:1139)
>     at java.base/java.net.InetAddress.getAllByName0(InetAddress.java:1818)
>     at java.base/java.net.InetAddress.getLocalHost(InetAddress.java:1931)
>     at org.apache.logging.log4j.core.util.NetUtils.getLocalHostname(NetUtils.java:56)
>     at org.apache.logging.log4j.core.LoggerContext.lambda$setConfiguration$0(LoggerContext.java:625)
>  {code}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org


