[jira] [Commented] (HADOOP-19120) [ABFS]: ApacheHttpClient adaptation as network library
[ https://issues.apache.org/jira/browse/HADOOP-19120?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17869437#comment-17869437 ]

ASF GitHub Bot commented on HADOOP-19120:
-----------------------------------------

steveloughran merged PR #6959:
URL: https://github.com/apache/hadoop/pull/6959

> [ABFS]: ApacheHttpClient adaptation as network library
> ------------------------------------------------------
>
>                 Key: HADOOP-19120
>                 URL: https://issues.apache.org/jira/browse/HADOOP-19120
>             Project: Hadoop Common
>          Issue Type: Sub-task
>          Components: fs/azure
>    Affects Versions: 3.5.0
>            Reporter: Pranav Saxena
>            Assignee: Pranav Saxena
>            Priority: Major
>              Labels: pull-request-available
>             Fix For: 3.5.0
>
>
> Apache HttpClient is more feature-rich and flexible, and gives the
> application more granular control over networking parameters.
> ABFS currently relies on the JDK-net library. This library is managed by
> OpenJDK and has no performance problems. However, it limits the application's
> control over networking, and exposes very few APIs and hooks that the
> application can use to get metrics or to choose which connection is reused
> and when. ApacheHttpClient provides hooks to fetch metrics and to control
> networking parameters.
> A custom implementation of the connection pool is used. The implementation is
> adapted from the JDK8 connection pooling. Reasons for doing it:
> 1. The PoolingHttpClientConnectionManager heuristic caches all the reusable
> connections it has created. The JDK's implementation caches only a limited
> number of connections. The limit is given by the JVM system property
> "http.maxConnections". If the system property is not set, it defaults to 5.
> Connection-establishment latency increased when all the connections were
> cached. Hence the pooling heuristic of the JDK netlib is adapted.
> 2. PoolingHttpClientConnectionManager expects the application to
> provide `setMaxPerRoute` and `setMaxTotal`, which the implementation uses as
> the total number of connections it can create. For an application using ABFS,
> it is not feasible to provide a value when the connectionManager is
> initialised. The JDK's implementation has no cap on the number of
> connections it can have open at any one time. Hence the pooling
> heuristic of the JDK netlib is adapted.

--
This message was sent by Atlassian Jira
(v8.20.10#820010)

---------------------------------------------------------------------
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org
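
The pooling heuristic described in the issue text above (cache only a bounded number of idle connections, with the bound taken from "http.maxConnections" or 5, and place no cap on how many connections may be open) could look roughly like the following. This is only a minimal sketch under stated assumptions: the class name `BoundedKeepAliveCache`, its methods, and the most-recently-used-first ordering are hypothetical and are not taken from the ABFS patch.

```java
import java.util.ArrayDeque;
import java.util.Deque;

/**
 * Hypothetical sketch of a JDK-style keep-alive cache. Not the ABFS class.
 * Only idle connections are bounded; the caller may open any number of
 * new connections when the cache has nothing to hand out.
 */
final class BoundedKeepAliveCache<C> {
    private final int maxCached;
    private final Deque<C> idle = new ArrayDeque<>();

    BoundedKeepAliveCache(int maxCached) {
        if (maxCached < 0) {
            throw new IllegalArgumentException("maxCached must be >= 0");
        }
        this.maxCached = maxCached;
    }

    /** Reads the JVM system property "http.maxConnections", defaulting to 5. */
    static int defaultMaxCached() {
        return Integer.getInteger("http.maxConnections", 5);
    }

    /**
     * Returns the most recently cached idle connection, or null if none is
     * cached. On null the caller opens a fresh connection (no cap applies).
     */
    synchronized C poll() {
        return idle.pollFirst();
    }

    /**
     * Offers a reusable connection back to the cache. Returns false when the
     * cache is already full; the caller should then close the connection.
     */
    synchronized boolean offer(C connection) {
        if (idle.size() >= maxCached) {
            return false;
        }
        idle.addFirst(connection);
        return true;
    }

    /** Number of idle connections currently cached. */
    synchronized int size() {
        return idle.size();
    }
}
```

A caller would typically construct the cache with `BoundedKeepAliveCache.defaultMaxCached()`, call `poll()` before opening a connection, and call `offer()` after a request completes, closing the connection itself whenever `offer()` returns false.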
[jira] [Commented] (HADOOP-19120) [ABFS]: ApacheHttpClient adaptation as network library
[ https://issues.apache.org/jira/browse/HADOOP-19120?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17868644#comment-17868644 ]

ASF GitHub Bot commented on HADOOP-19120:
-----------------------------------------

steveloughran commented on PR #6959:
URL: https://github.com/apache/hadoop/pull/6959#issuecomment-2250077439

thanks. just testing the trunk code locally, to see if there are any problems first
[jira] [Commented] (HADOOP-19120) [ABFS]: ApacheHttpClient adaptation as network library
[ https://issues.apache.org/jira/browse/HADOOP-19120?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17868541#comment-17868541 ]

ASF GitHub Bot commented on HADOOP-19120:
-----------------------------------------

saxenapranav commented on PR #6633:
URL: https://github.com/apache/hadoop/pull/6633#issuecomment-2249408482

Thank you very much @steveloughran. Really thankful to you for your time and energy in this. Have raised a backport PR https://github.com/apache/hadoop/pull/6959 on branch-3.4. Thank you very much!
[jira] [Commented] (HADOOP-19120) [ABFS]: ApacheHttpClient adaptation as network library
[ https://issues.apache.org/jira/browse/HADOOP-19120?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17868540#comment-17868540 ]

ASF GitHub Bot commented on HADOOP-19120:
-----------------------------------------

saxenapranav commented on PR #6959:
URL: https://github.com/apache/hadoop/pull/6959#issuecomment-2249402554

Hi @steveloughran,
This is a backport of trunk PR https://github.com/apache/hadoop/pull/6633. Requesting your kind review please. Thank you very much!
[jira] [Commented] (HADOOP-19120) [ABFS]: ApacheHttpClient adaptation as network library
[ https://issues.apache.org/jira/browse/HADOOP-19120?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17868429#comment-17868429 ]

ASF GitHub Bot commented on HADOOP-19120:
-----------------------------------------

hadoop-yetus commented on PR #6959:
URL: https://github.com/apache/hadoop/pull/6959#issuecomment-2248484596

:confetti_ball: **+1 overall**

| Vote | Subsystem | Runtime | Logfile | Comment |
|:---:|---:|---:|:---:|:---:|
| +0 :ok: | reexec | 1m 4s | | Docker mode activated. |
|||| _ Prechecks _ ||
| +1 :green_heart: | dupname | 0m 1s | | No case conflicting files found. |
| +0 :ok: | codespell | 0m 1s | | codespell was not available. |
| +0 :ok: | detsecrets | 0m 1s | | detect-secrets was not available. |
| +0 :ok: | markdownlint | 0m 1s | | markdownlint was not available. |
| +1 :green_heart: | @author | 0m 0s | | The patch does not contain any @author tags. |
| +1 :green_heart: | test4tests | 0m 0s | | The patch appears to include 21 new or modified test files. |
|||| _ branch-3.4 Compile Tests _ ||
| +0 :ok: | mvndep | 14m 12s | | Maven dependency ordering for branch |
| +1 :green_heart: | mvninstall | 34m 25s | | branch-3.4 passed |
| +1 :green_heart: | compile | 27m 11s | | branch-3.4 passed with JDK Ubuntu-11.0.23+9-post-Ubuntu-1ubuntu120.04.2 |
| +1 :green_heart: | compile | 21m 35s | | branch-3.4 passed with JDK Private Build-1.8.0_412-8u412-ga-1~20.04.1-b08 |
| +1 :green_heart: | checkstyle | 5m 8s | | branch-3.4 passed |
| +1 :green_heart: | mvnsite | 2m 27s | | branch-3.4 passed |
| +1 :green_heart: | javadoc | 2m 0s | | branch-3.4 passed with JDK Ubuntu-11.0.23+9-post-Ubuntu-1ubuntu120.04.2 |
| +1 :green_heart: | javadoc | 1m 27s | | branch-3.4 passed with JDK Private Build-1.8.0_412-8u412-ga-1~20.04.1-b08 |
| +1 :green_heart: | spotbugs | 3m 59s | | branch-3.4 passed |
| +1 :green_heart: | shadedclient | 38m 35s | | branch has no errors when building and testing our client artifacts. |
|||| _ Patch Compile Tests _ ||
| +0 :ok: | mvndep | 0m 29s | | Maven dependency ordering for patch |
| +1 :green_heart: | mvninstall | 1m 27s | | the patch passed |
| +1 :green_heart: | compile | 24m 49s | | the patch passed with JDK Ubuntu-11.0.23+9-post-Ubuntu-1ubuntu120.04.2 |
| +1 :green_heart: | javac | 24m 49s | | the patch passed |
| +1 :green_heart: | compile | 22m 57s | | the patch passed with JDK Private Build-1.8.0_412-8u412-ga-1~20.04.1-b08 |
| +1 :green_heart: | javac | 22m 57s | | the patch passed |
| +1 :green_heart: | blanks | 0m 0s | | The patch has no blanks issues. |
| -0 :warning: | checkstyle | 4m 58s | [/results-checkstyle-root.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6959/2/artifact/out/results-checkstyle-root.txt) | root: The patch generated 2 new + 18 unchanged - 0 fixed = 20 total (was 18) |
| +1 :green_heart: | mvnsite | 2m 31s | | the patch passed |
| +1 :green_heart: | javadoc | 1m 54s | | the patch passed with JDK Ubuntu-11.0.23+9-post-Ubuntu-1ubuntu120.04.2 |
| +1 :green_heart: | javadoc | 1m 30s | | the patch passed with JDK Private Build-1.8.0_412-8u412-ga-1~20.04.1-b08 |
| +1 :green_heart: | spotbugs | 4m 34s | | the patch passed |
| +1 :green_heart: | shadedclient | 38m 52s | | patch has no errors when building and testing our client artifacts. |
|||| _ Other Tests _ ||
| +1 :green_heart: | unit | 20m 51s | | hadoop-common in the patch passed. |
| +1 :green_heart: | unit | 2m 33s | | hadoop-azure in the patch passed. |
| +1 :green_heart: | asflicense | 0m 58s | | The patch does not generate ASF License warnings. |
| | | 284m 6s | | |

| Subsystem | Report/Notes |
|---:|:---|
| Docker | ClientAPI=1.44 ServerAPI=1.44 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6959/2/artifact/out/Dockerfile |
| GITHUB PR | https://github.com/apache/hadoop/pull/6959 |
| Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets markdownlint |
| uname | Linux a5589c6342ac 5.15.0-94-generic #104-Ubuntu SMP Tue Jan 9 15:25:40 UTC 2024 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | dev-support/bin/hadoop.sh |
| git revision | branch-3.4 / 48ba0ad1fa40d984dd1b21892fb3698514852564 |
| Default Java | Private Build-1.8.0_412-8u412-ga-1~20.04.1-b08 |
| Multi-JDK versions | /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.23+9-post-Ubuntu-1ubuntu120.04.2 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_412-8u412-ga-1~20.04.
[jira] [Commented] (HADOOP-19120) [ABFS]: ApacheHttpClient adaptation as network library
[ https://issues.apache.org/jira/browse/HADOOP-19120?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17868428#comment-17868428 ]

ASF GitHub Bot commented on HADOOP-19120:
-----------------------------------------

hadoop-yetus commented on PR #6959:
URL: https://github.com/apache/hadoop/pull/6959#issuecomment-2248483299

:confetti_ball: **+1 overall**

| Vote | Subsystem | Runtime | Logfile | Comment |
|:---:|---:|---:|:---:|:---:|
| +0 :ok: | reexec | 0m 53s | | Docker mode activated. |
|||| _ Prechecks _ ||
| +1 :green_heart: | dupname | 0m 1s | | No case conflicting files found. |
| +0 :ok: | codespell | 0m 1s | | codespell was not available. |
| +0 :ok: | detsecrets | 0m 1s | | detect-secrets was not available. |
| +0 :ok: | markdownlint | 0m 1s | | markdownlint was not available. |
| +1 :green_heart: | @author | 0m 0s | | The patch does not contain any @author tags. |
| +1 :green_heart: | test4tests | 0m 0s | | The patch appears to include 21 new or modified test files. |
|||| _ branch-3.4 Compile Tests _ ||
| +0 :ok: | mvndep | 14m 7s | | Maven dependency ordering for branch |
| +1 :green_heart: | mvninstall | 34m 48s | | branch-3.4 passed |
| +1 :green_heart: | compile | 26m 39s | | branch-3.4 passed with JDK Ubuntu-11.0.23+9-post-Ubuntu-1ubuntu120.04.2 |
| +1 :green_heart: | compile | 23m 40s | | branch-3.4 passed with JDK Private Build-1.8.0_412-8u412-ga-1~20.04.1-b08 |
| +1 :green_heart: | checkstyle | 4m 54s | | branch-3.4 passed |
| +1 :green_heart: | mvnsite | 2m 35s | | branch-3.4 passed |
| +1 :green_heart: | javadoc | 1m 58s | | branch-3.4 passed with JDK Ubuntu-11.0.23+9-post-Ubuntu-1ubuntu120.04.2 |
| +1 :green_heart: | javadoc | 1m 27s | | branch-3.4 passed with JDK Private Build-1.8.0_412-8u412-ga-1~20.04.1-b08 |
| +1 :green_heart: | spotbugs | 3m 59s | | branch-3.4 passed |
| +1 :green_heart: | shadedclient | 37m 4s | | branch has no errors when building and testing our client artifacts. |
|||| _ Patch Compile Tests _ ||
| +0 :ok: | mvndep | 0m 29s | | Maven dependency ordering for patch |
| +1 :green_heart: | mvninstall | 1m 33s | | the patch passed |
| +1 :green_heart: | compile | 24m 29s | | the patch passed with JDK Ubuntu-11.0.23+9-post-Ubuntu-1ubuntu120.04.2 |
| +1 :green_heart: | javac | 24m 29s | | the patch passed |
| +1 :green_heart: | compile | 22m 13s | | the patch passed with JDK Private Build-1.8.0_412-8u412-ga-1~20.04.1-b08 |
| +1 :green_heart: | javac | 22m 13s | | the patch passed |
| +1 :green_heart: | blanks | 0m 0s | | The patch has no blanks issues. |
| -0 :warning: | checkstyle | 4m 49s | [/results-checkstyle-root.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6959/3/artifact/out/results-checkstyle-root.txt) | root: The patch generated 2 new + 18 unchanged - 0 fixed = 20 total (was 18) |
| +1 :green_heart: | mvnsite | 2m 30s | | the patch passed |
| +1 :green_heart: | javadoc | 1m 53s | | the patch passed with JDK Ubuntu-11.0.23+9-post-Ubuntu-1ubuntu120.04.2 |
| +1 :green_heart: | javadoc | 1m 28s | | the patch passed with JDK Private Build-1.8.0_412-8u412-ga-1~20.04.1-b08 |
| +1 :green_heart: | spotbugs | 4m 33s | | the patch passed |
| +1 :green_heart: | shadedclient | 38m 1s | | patch has no errors when building and testing our client artifacts. |
|||| _ Other Tests _ ||
| +1 :green_heart: | unit | 20m 44s | | hadoop-common in the patch passed. |
| +1 :green_heart: | unit | 2m 35s | | hadoop-azure in the patch passed. |
| +1 :green_heart: | asflicense | 1m 0s | | The patch does not generate ASF License warnings. |
| | | 282m 8s | | |

| Subsystem | Report/Notes |
|---:|:---|
| Docker | ClientAPI=1.44 ServerAPI=1.44 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6959/3/artifact/out/Dockerfile |
| GITHUB PR | https://github.com/apache/hadoop/pull/6959 |
| Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets markdownlint |
| uname | Linux ac10b97b1a59 5.15.0-94-generic #104-Ubuntu SMP Tue Jan 9 15:25:40 UTC 2024 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | dev-support/bin/hadoop.sh |
| git revision | branch-3.4 / 48ba0ad1fa40d984dd1b21892fb3698514852564 |
| Default Java | Private Build-1.8.0_412-8u412-ga-1~20.04.1-b08 |
| Multi-JDK versions | /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.23+9-post-Ubuntu-1ubuntu120.04.2 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_412-8u412-ga-1~20.04.
[jira] [Commented] (HADOOP-19120) [ABFS]: ApacheHttpClient adaptation as network library
[ https://issues.apache.org/jira/browse/HADOOP-19120?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17868348#comment-17868348 ]

ASF GitHub Bot commented on HADOOP-19120:
-----------------------------------------

hadoop-yetus commented on PR #6959:
URL: https://github.com/apache/hadoop/pull/6959#issuecomment-2247729029

:confetti_ball: **+1 overall**

| Vote | Subsystem | Runtime | Logfile | Comment |
|:---:|---:|---:|:---:|:---:|
| +0 :ok: | reexec | 17m 47s | | Docker mode activated. |
|||| _ Prechecks _ ||
| +1 :green_heart: | dupname | 0m 0s | | No case conflicting files found. |
| +0 :ok: | codespell | 0m 1s | | codespell was not available. |
| +0 :ok: | detsecrets | 0m 1s | | detect-secrets was not available. |
| +0 :ok: | markdownlint | 0m 1s | | markdownlint was not available. |
| +1 :green_heart: | @author | 0m 0s | | The patch does not contain any @author tags. |
| +1 :green_heart: | test4tests | 0m 0s | | The patch appears to include 21 new or modified test files. |
|||| _ branch-3.4 Compile Tests _ ||
| +0 :ok: | mvndep | 14m 33s | | Maven dependency ordering for branch |
| +1 :green_heart: | mvninstall | 35m 42s | | branch-3.4 passed |
| +1 :green_heart: | compile | 19m 39s | | branch-3.4 passed with JDK Ubuntu-11.0.23+9-post-Ubuntu-1ubuntu120.04.2 |
| +1 :green_heart: | compile | 18m 12s | | branch-3.4 passed with JDK Private Build-1.8.0_412-8u412-ga-1~20.04.1-b08 |
| +1 :green_heart: | checkstyle | 4m 43s | | branch-3.4 passed |
| +1 :green_heart: | mvnsite | 2m 31s | | branch-3.4 passed |
| +1 :green_heart: | javadoc | 2m 0s | | branch-3.4 passed with JDK Ubuntu-11.0.23+9-post-Ubuntu-1ubuntu120.04.2 |
| +1 :green_heart: | javadoc | 1m 34s | | branch-3.4 passed with JDK Private Build-1.8.0_412-8u412-ga-1~20.04.1-b08 |
| +1 :green_heart: | spotbugs | 3m 44s | | branch-3.4 passed |
| +1 :green_heart: | shadedclient | 39m 28s | | branch has no errors when building and testing our client artifacts. |
|||| _ Patch Compile Tests _ ||
| +0 :ok: | mvndep | 0m 31s | | Maven dependency ordering for patch |
| +1 :green_heart: | mvninstall | 1m 23s | | the patch passed |
| +1 :green_heart: | compile | 18m 38s | | the patch passed with JDK Ubuntu-11.0.23+9-post-Ubuntu-1ubuntu120.04.2 |
| +1 :green_heart: | javac | 18m 38s | | the patch passed |
| +1 :green_heart: | compile | 18m 17s | | the patch passed with JDK Private Build-1.8.0_412-8u412-ga-1~20.04.1-b08 |
| +1 :green_heart: | javac | 18m 17s | | the patch passed |
| +1 :green_heart: | blanks | 0m 0s | | The patch has no blanks issues. |
| -0 :warning: | checkstyle | 4m 40s | [/results-checkstyle-root.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6959/1/artifact/out/results-checkstyle-root.txt) | root: The patch generated 4 new + 18 unchanged - 0 fixed = 22 total (was 18) |
| +1 :green_heart: | mvnsite | 2m 29s | | the patch passed |
| +1 :green_heart: | javadoc | 1m 53s | | the patch passed with JDK Ubuntu-11.0.23+9-post-Ubuntu-1ubuntu120.04.2 |
| +1 :green_heart: | javadoc | 1m 35s | | the patch passed with JDK Private Build-1.8.0_412-8u412-ga-1~20.04.1-b08 |
| +1 :green_heart: | spotbugs | 4m 6s | | the patch passed |
| +1 :green_heart: | shadedclient | 39m 59s | | patch has no errors when building and testing our client artifacts. |
|||| _ Other Tests _ ||
| +1 :green_heart: | unit | 20m 1s | | hadoop-common in the patch passed. |
| +1 :green_heart: | unit | 2m 33s | | hadoop-azure in the patch passed. |
| +1 :green_heart: | asflicense | 1m 1s | | The patch does not generate ASF License warnings. |
| | | 280m 56s | | |

| Subsystem | Report/Notes |
|---:|:---|
| Docker | ClientAPI=1.44 ServerAPI=1.44 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6959/1/artifact/out/Dockerfile |
| GITHUB PR | https://github.com/apache/hadoop/pull/6959 |
| Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets markdownlint |
| uname | Linux d1f7dd27beb9 5.15.0-94-generic #104-Ubuntu SMP Tue Jan 9 15:25:40 UTC 2024 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | dev-support/bin/hadoop.sh |
| git revision | branch-3.4 / 99c6095b404d3a5e5a89ad529449eeefe1509980 |
| Default Java | Private Build-1.8.0_412-8u412-ga-1~20.04.1-b08 |
| Multi-JDK versions | /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.23+9-post-Ubuntu-1ubuntu120.04.2 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_412-8u412-ga-1~20.04.
[jira] [Commented] (HADOOP-19120) [ABFS]: ApacheHttpClient adaptation as network library
[ https://issues.apache.org/jira/browse/HADOOP-19120?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17868274#comment-17868274 ]

ASF GitHub Bot commented on HADOOP-19120:
-----------------------------------------

saxenapranav commented on PR #6959:
URL: https://github.com/apache/hadoop/pull/6959#issuecomment-2247090911

--------------------------------
AGGREGATED TEST RESULT

HNS-OAuth
[WARNING] Tests run: 153, Failures: 0, Errors: 0, Skipped: 2
[WARNING] Tests run: 644, Failures: 0, Errors: 0, Skipped: 82
[WARNING] Tests run: 424, Failures: 0, Errors: 0, Skipped: 57

HNS-SharedKey
[WARNING] Tests run: 153, Failures: 0, Errors: 0, Skipped: 3
[WARNING] Tests run: 644, Failures: 0, Errors: 0, Skipped: 34
[WARNING] Tests run: 424, Failures: 0, Errors: 0, Skipped: 44

NonHNS-SharedKey
[WARNING] Tests run: 153, Failures: 0, Errors: 0, Skipped: 9
[WARNING] Tests run: 628, Failures: 0, Errors: 0, Skipped: 274
[WARNING] Tests run: 424, Failures: 0, Errors: 0, Skipped: 47

AppendBlob-HNS-OAuth
[WARNING] Tests run: 153, Failures: 0, Errors: 0, Skipped: 2
[WARNING] Tests run: 644, Failures: 0, Errors: 0, Skipped: 84
[WARNING] Tests run: 424, Failures: 0, Errors: 0, Skipped: 81

Time taken: 28 mins 30 secs.

azureuser@pranav-ind-vm:~/hadoop/hadoop-tools/hadoop-azure$ git log
commit 99c6095b404d3a5e5a89ad529449eeefe1509980 (HEAD -> saxenapranav/abfs-apachehttpclient-3.4, origin/saxenapranav/abfs-apachehttpclient-3.4)
Author: Pranav Saxena <>
Date:   Tue Jul 23 21:42:37 2024 -0700

    cherrypick of b60497ff41e1dc149d1610f4cc6ea4e0609f9946 : https://github.com/apache/hadoop/commit/b60497ff41e1dc149d1610f4cc6ea4e0609f9946 : ApacheHttpClient adaptation in ABFS. https://github.com/apache/hadoop/pull/6633
[jira] [Commented] (HADOOP-19120) [ABFS]: ApacheHttpClient adaptation as network library
[ https://issues.apache.org/jira/browse/HADOOP-19120?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17868272#comment-17868272 ]

ASF GitHub Bot commented on HADOOP-19120:
-----------------------------------------

saxenapranav opened a new pull request, #6959:
URL: https://github.com/apache/hadoop/pull/6959

JIRA: https://issues.apache.org/jira/browse/HADOOP-19120
trunk pr: https://github.com/apache/hadoop/pull/6633

Apache httpclient 4.5.x is the new default implementation of http connections; this supports a large configurable pool of connections along with the ability to limit their lifespan.

The networking library can be chosen using the configuration option fs.azure.networking.library

The supported values are
- APACHE_HTTP_CLIENT : Use Apache HttpClient [Default]
- JDK_HTTP_URL_CONNECTION : Use JDK networking library
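
The selection rule in the pull-request text above (option `fs.azure.networking.library`, two supported values, `APACHE_HTTP_CLIENT` as the default) can be illustrated with a small stand-alone sketch. The enum name `NetworkingLibrary`, the `fromConfig` helper, the use of a plain `Map` in place of a Hadoop configuration object, and the case-insensitive parsing are all assumptions made for illustration; none of them is taken from the ABFS code.

```java
import java.util.Locale;
import java.util.Map;

/**
 * Hypothetical stand-in for the two documented values of
 * fs.azure.networking.library. Not the ABFS class.
 */
enum NetworkingLibrary {
    APACHE_HTTP_CLIENT,
    JDK_HTTP_URL_CONNECTION;

    /** Option name as given in the pull-request description. */
    static final String CONFIG_KEY = "fs.azure.networking.library";

    /**
     * Resolves the library from a key/value map (an assumed stand-in for a
     * real configuration object). Falls back to APACHE_HTTP_CLIENT when the
     * option is unset or blank; throws IllegalArgumentException for any
     * value other than the two supported names.
     */
    static NetworkingLibrary fromConfig(Map<String, String> conf) {
        String raw = conf.get(CONFIG_KEY);
        if (raw == null || raw.trim().isEmpty()) {
            return APACHE_HTTP_CLIENT;
        }
        // Case-insensitive matching is an assumption of this sketch.
        return valueOf(raw.trim().toUpperCase(Locale.ROOT));
    }
}
```

In this sketch an unset option resolves to `APACHE_HTTP_CLIENT`, and setting the option to `JDK_HTTP_URL_CONNECTION` selects the JDK networking library.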
[jira] [Commented] (HADOOP-19120) [ABFS]: ApacheHttpClient adaptation as network library
[ https://issues.apache.org/jira/browse/HADOOP-19120?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17867826#comment-17867826 ]

ASF GitHub Bot commented on HADOOP-19120:
-----------------------------------------

steveloughran commented on PR #6633:
URL: https://github.com/apache/hadoop/pull/6633#issuecomment-2243524698

   ok! merged! please do the branch 3.4 backport and test.

   because httpclient is the new default, "compile" is the correct maven scope. I will look at the cloud storage dependencies

> [ABFS]: ApacheHttpClient adaptation as network library
> ------------------------------------------------------
>
>                 Key: HADOOP-19120
>                 URL: https://issues.apache.org/jira/browse/HADOOP-19120
>             Project: Hadoop Common
>          Issue Type: Sub-task
>          Components: fs/azure
>    Affects Versions: 3.5.0
>            Reporter: Pranav Saxena
>            Assignee: Pranav Saxena
>            Priority: Major
>              Labels: pull-request-available
>             Fix For: 3.5.0, 3.4.1
>
>
> Apache HttpClient is more feature-rich and flexible and gives the application
> more granular control over networking parameters.
> ABFS currently relies on the JDK-net library. This library is managed by
> OpenJDK and has no performance problem. However, it limits the application's
> control over networking, and there are very few APIs and hooks exposed that
> the application can use to get metrics or to choose which connection should
> be reused and when. ApacheHttpClient will give important hooks to fetch
> important metrics and control networking parameters.
> A custom implementation of connection-pool is used. The implementation is
> adapted from the JDK8 connection pooling. Reasons for doing it:
> 1. PoolingHttpClientConnectionManager heuristic caches all the reusable
> connections it has created. JDK's implementation only caches a limited number
> of connections. The limit is given by the JVM system property
> "http.maxConnections". If there is no system property, it defaults to 5.
> Connection-establishment latency increased when all the connections were
> cached. Hence, adapting the pooling heuristic of JDK netlib.
> 2. PoolingHttpClientConnectionManager expects the application to
> provide `setMaxPerRoute` and `setMaxTotal`, which the implementation uses as
> the total number of connections it can create. For applications using ABFS, it
> is not feasible to provide a value in the initialisation of the
> connectionManager. JDK's implementation has no cap on the number of
> connections it can have open at any moment. Hence, adapting the pooling
> heuristic of JDK netlib.

--
This message was sent by Atlassian Jira
(v8.20.10#820010)

---------------------------------------------------------------------
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org
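
The pooling heuristic described in the issue above (cache only a bounded number of idle connections, with the bound read from the "http.maxConnections" system property and defaulting to 5, while placing no cap on how many connections may be open at once) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the ABFS or JDK implementation: the class name `BoundedConnectionCache`, its method names, and the last-in-first-out reuse order are all hypothetical choices made for this sketch.

```java
import java.util.ArrayDeque;
import java.util.Deque;

/**
 * Illustrative only: a bounded cache of idle, reusable connections.
 * The class and its API are hypothetical; they are not taken from ABFS or the JDK.
 */
final class BoundedConnectionCache<C> {
    // Upper bound on idle connections kept for reuse.
    private final int maxCached;
    // Idle connections; the most recently returned connection is reused first (assumption).
    private final Deque<C> idle = new ArrayDeque<>();

    /** Uses the system property and default cited in the issue description. */
    BoundedConnectionCache() {
        this(Integer.getInteger("http.maxConnections", 5));
    }

    BoundedConnectionCache(int maxCached) {
        this.maxCached = maxCached;
    }

    /**
     * Offers a connection for later reuse.
     * Returns false when the cache is full; the caller should then close the connection.
     */
    synchronized boolean put(C connection) {
        if (idle.size() >= maxCached) {
            return false;
        }
        idle.push(connection);
        return true;
    }

    /**
     * Returns a cached connection, or null when none is cached.
     * A null result means the caller opens a new connection; nothing here limits
     * how many connections are open at the same time.
     */
    synchronized C get() {
        return idle.poll();
    }

    synchronized int size() {
        return idle.size();
    }
}
```

With a bound of 2, a third `put` is rejected and the caller closes that connection instead of caching it, which is the behaviour point 1 of the description contrasts with a manager that caches every reusable connection. Because `get` never blocks and never refuses to hand control back to the caller, no `setMaxTotal`-style limit has to be chosen up front, which is the concern raised in point 2.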
[jira] [Commented] (HADOOP-19120) [ABFS]: ApacheHttpClient adaptation as network library
[ https://issues.apache.org/jira/browse/HADOOP-19120?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17867825#comment-17867825 ]

ASF GitHub Bot commented on HADOOP-19120:
-----------------------------------------

steveloughran merged PR #6633:
URL: https://github.com/apache/hadoop/pull/6633
[jira] [Commented] (HADOOP-19120) [ABFS]: ApacheHttpClient adaptation as network library
[ https://issues.apache.org/jira/browse/HADOOP-19120?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17867665#comment-17867665 ]

ASF GitHub Bot commented on HADOOP-19120:
-----------------------------------------

saxenapranav commented on PR #6633:
URL: https://github.com/apache/hadoop/pull/6633#issuecomment-2242162459

   Hi @steveloughran, thanks a lot for all the help in the review. Have added the documentation for the ease of developers. Requesting your kind help in the final review please. Really thankful for your time in this. Thank you very much!
[jira] [Commented] (HADOOP-19120) [ABFS]: ApacheHttpClient adaptation as network library
[ https://issues.apache.org/jira/browse/HADOOP-19120?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17867167#comment-17867167 ]

ASF GitHub Bot commented on HADOOP-19120:
-----------------------------------------

saxenapranav commented on PR #6633:
URL: https://github.com/apache/hadoop/pull/6633#issuecomment-2238088749

   Thank you @steveloughran for the feedback. I have addressed it. Requesting your kind review and approval please. Thank you very much!
[jira] [Commented] (HADOOP-19120) [ABFS]: ApacheHttpClient adaptation as network library
[ https://issues.apache.org/jira/browse/HADOOP-19120?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17865850#comment-17865850 ]

ASF GitHub Bot commented on HADOOP-19120:
-----------------------------------------

saxenapranav commented on PR #6633:
URL: https://github.com/apache/hadoop/pull/6633#issuecomment-2227641911

   Thank you @steveloughran very much for the reviews. Have taken the suggestions. Requesting your kind review please. Thank you!
[jira] [Commented] (HADOOP-19120) [ABFS]: ApacheHttpClient adaptation as network library
[ https://issues.apache.org/jira/browse/HADOOP-19120?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17865848#comment-17865848 ]

ASF GitHub Bot commented on HADOOP-19120:
-----------------------------------------

saxenapranav commented on PR #6633:
URL: https://github.com/apache/hadoop/pull/6633#issuecomment-2227641085

   AGGREGATED TEST RESULT

   HNS-OAuth

   [WARNING] Tests run: 154, Failures: 0, Errors: 0, Skipped: 3
   [WARNING] Tests run: 646, Failures: 0, Errors: 0, Skipped: 85
   [WARNING] Tests run: 424, Failures: 0, Errors: 0, Skipped: 57

   HNS-SharedKey

   [WARNING] Tests run: 154, Failures: 0, Errors: 0, Skipped: 4
   [WARNING] Tests run: 646, Failures: 0, Errors: 0, Skipped: 37
   [WARNING] Tests run: 424, Failures: 0, Errors: 0, Skipped: 44

   NonHNS-SharedKey

   [ERROR] testUpdateDeepDirectoryStructureToRemote(org.apache.hadoop.fs.azurebfs.contract.ITestAbfsFileSystemContractDistCp) Time elapsed: 2.858 s <<< FAILURE!
   [WARNING] Tests run: 154, Failures: 0, Errors: 0, Skipped: 10
   [WARNING] Tests run: 630, Failures: 0, Errors: 0, Skipped: 277
   [ERROR] Tests run: 424, Failures: 1, Errors: 0, Skipped: 47

   AppendBlob-HNS-OAuth

   [WARNING] Tests run: 154, Failures: 0, Errors: 0, Skipped: 3
   [WARNING] Tests run: 646, Failures: 0, Errors: 0, Skipped: 87
   [WARNING] Tests run: 424, Failures: 0, Errors: 0, Skipped: 81

   Time taken: 26 mins 22 secs.

   azureuser@pranav-ind-vm:~/hadoop/hadoop-tools/hadoop-azure$ git log
   commit f896d84d25a7991b21a5f835a468ff2afae649fd (origin/saxenapranav/abfs-apachehttpclient)
   Author: Pranav Saxena <>
   Date: Thu Jul 11 06:48:53 2024 -0700

       eol fix
[jira] [Commented] (HADOOP-19120) [ABFS]: ApacheHttpClient adaptation as network library
[ https://issues.apache.org/jira/browse/HADOOP-19120?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17865485#comment-17865485 ]

ASF GitHub Bot commented on HADOOP-19120:
-----------------------------------------

saxenapranav commented on PR #6633:
URL: https://github.com/apache/hadoop/pull/6633#issuecomment-2225629718

   > ok, let's see what this test run says.
   >
   > One thing I've just realised is that there is no public documentation for this. Which it needs
   >
   > proposed: somewhere you add a section on this in the public docs
   >
   > * list the new options
   > * why httpclient is needed on the classpath
   > * show how to fall back to jvm
   >
   > This can be done as a followup: what is key is "people shouldn't need to read the source to see this stuff"

   Thank you @steveloughran. Makes sense! Have added the documentation as part of https://hadoop.apache.org/docs/stable/hadoop-azure/abfs.html#Technical_notes. Thank you very much!
[jira] [Commented] (HADOOP-19120) [ABFS]: ApacheHttpClient adaptation as network library
[ https://issues.apache.org/jira/browse/HADOOP-19120?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17865199#comment-17865199 ]

ASF GitHub Bot commented on HADOOP-19120:
-----------------------------------------

hadoop-yetus commented on PR #6633:
URL: https://github.com/apache/hadoop/pull/6633#issuecomment-2223730129

   :confetti_ball: **+1 overall**

   | Vote | Subsystem | Runtime | Logfile | Comment |
   |:----:|----------:|--------:|:-------:|:-------:|
   | +0 :ok: | reexec | 0m 31s | | Docker mode activated. |
   |||| _ Prechecks _ |
   | +1 :green_heart: | dupname | 0m 1s | | No case conflicting files found. |
   | +0 :ok: | codespell | 0m 0s | | codespell was not available. |
   | +0 :ok: | detsecrets | 0m 0s | | detect-secrets was not available. |
   | +0 :ok: | markdownlint | 0m 0s | | markdownlint was not available. |
   | +1 :green_heart: | @author | 0m 0s | | The patch does not contain any @author tags. |
   | +1 :green_heart: | test4tests | 0m 0s | | The patch appears to include 22 new or modified test files. |
   |||| _ trunk Compile Tests _ |
   | +0 :ok: | mvndep | 14m 56s | | Maven dependency ordering for branch |
   | +1 :green_heart: | mvninstall | 32m 34s | | trunk passed |
   | +1 :green_heart: | compile | 17m 31s | | trunk passed with JDK Ubuntu-11.0.23+9-post-Ubuntu-1ubuntu120.04.2 |
   | +1 :green_heart: | compile | 16m 30s | | trunk passed with JDK Private Build-1.8.0_412-8u412-ga-1~20.04.1-b08 |
   | +1 :green_heart: | checkstyle | 4m 20s | | trunk passed |
   | +1 :green_heart: | mvnsite | 2m 42s | | trunk passed |
   | +1 :green_heart: | javadoc | 2m 8s | | trunk passed with JDK Ubuntu-11.0.23+9-post-Ubuntu-1ubuntu120.04.2 |
   | +1 :green_heart: | javadoc | 1m 45s | | trunk passed with JDK Private Build-1.8.0_412-8u412-ga-1~20.04.1-b08 |
   | +1 :green_heart: | spotbugs | 3m 54s | | trunk passed |
   | +1 :green_heart: | shadedclient | 34m 30s | | branch has no errors when building and testing our client artifacts. |
   | -0 :warning: | patch | 34m 57s | | Used diff version of patch file. Binary files and potentially other changes not applied. Please rebase and squash commits if necessary. |
   |||| _ Patch Compile Tests _ |
   | +0 :ok: | mvndep | 0m 34s | | Maven dependency ordering for patch |
   | +1 :green_heart: | mvninstall | 1m 29s | | the patch passed |
   | +1 :green_heart: | compile | 17m 8s | | the patch passed with JDK Ubuntu-11.0.23+9-post-Ubuntu-1ubuntu120.04.2 |
   | +1 :green_heart: | javac | 17m 8s | | the patch passed |
   | +1 :green_heart: | compile | 16m 2s | | the patch passed with JDK Private Build-1.8.0_412-8u412-ga-1~20.04.1-b08 |
   | +1 :green_heart: | javac | 16m 2s | | the patch passed |
   | +1 :green_heart: | blanks | 0m 0s | | The patch has no blanks issues. |
   | -0 :warning: | checkstyle | 4m 18s | [/results-checkstyle-root.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6633/72/artifact/out/results-checkstyle-root.txt) | root: The patch generated 2 new + 18 unchanged - 0 fixed = 20 total (was 18) |
   | +1 :green_heart: | mvnsite | 2m 38s | | the patch passed |
   | +1 :green_heart: | javadoc | 2m 3s | | the patch passed with JDK Ubuntu-11.0.23+9-post-Ubuntu-1ubuntu120.04.2 |
   | +1 :green_heart: | javadoc | 1m 46s | | the patch passed with JDK Private Build-1.8.0_412-8u412-ga-1~20.04.1-b08 |
   | +1 :green_heart: | spotbugs | 4m 15s | | the patch passed |
   | +1 :green_heart: | shadedclient | 34m 47s | | patch has no errors when building and testing our client artifacts. |
   |||| _ Other Tests _ |
   | +1 :green_heart: | unit | 19m 30s | | hadoop-common in the patch passed. |
   | +1 :green_heart: | unit | 2m 41s | | hadoop-azure in the patch passed. |
   | +1 :green_heart: | asflicense | 1m 4s | | The patch does not generate ASF License warnings. |
   | | | 244m 4s | | |

   | Subsystem | Report/Notes |
   |----------:|:-------------|
   | Docker | ClientAPI=1.46 ServerAPI=1.46 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6633/72/artifact/out/Dockerfile |
   | GITHUB PR | https://github.com/apache/hadoop/pull/6633 |
   | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets markdownlint |
   | uname | Linux 9215a14bf072 5.15.0-106-generic #116-Ubuntu SMP Wed Apr 17 09:17:56 UTC 2024 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | dev-support/bin/hadoop.sh |
   | git revision | trunk / f896d84d25a7991b21a5f835a468ff2afae649fd |
   | Default Java | Private Build-1.8.0_412-8u412-ga-1~20.04.1-b08 |
   | Multi-JDK versions | /usr/lib/jvm/java
[jira] [Commented] (HADOOP-19120) [ABFS]: ApacheHttpClient adaptation as network library
[ https://issues.apache.org/jira/browse/HADOOP-19120?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17865123#comment-17865123 ]

ASF GitHub Bot commented on HADOOP-19120:
-----------------------------------------

hadoop-yetus commented on PR #6633:
URL: https://github.com/apache/hadoop/pull/6633#issuecomment-2223210231

   :broken_heart: **-1 overall**

   | Vote | Subsystem | Runtime | Logfile | Comment |
   |:----:|----------:|--------:|:-------:|:-------:|
   | +0 :ok: | reexec | 0m 33s | | Docker mode activated. |
   |||| _ Prechecks _ |
   | +1 :green_heart: | dupname | 0m 1s | | No case conflicting files found. |
   | +0 :ok: | codespell | 0m 1s | | codespell was not available. |
   | +0 :ok: | detsecrets | 0m 1s | | detect-secrets was not available. |
   | +0 :ok: | markdownlint | 0m 1s | | markdownlint was not available. |
   | +1 :green_heart: | @author | 0m 0s | | The patch does not contain any @author tags. |
   | +1 :green_heart: | test4tests | 0m 0s | | The patch appears to include 22 new or modified test files. |
   |||| _ trunk Compile Tests _ |
   | +0 :ok: | mvndep | 14m 35s | | Maven dependency ordering for branch |
   | +1 :green_heart: | mvninstall | 32m 21s | | trunk passed |
   | +1 :green_heart: | compile | 17m 29s | | trunk passed with JDK Ubuntu-11.0.23+9-post-Ubuntu-1ubuntu120.04.2 |
   | +1 :green_heart: | compile | 15m 50s | | trunk passed with JDK Private Build-1.8.0_412-8u412-ga-1~20.04.1-b08 |
   | +1 :green_heart: | checkstyle | 4m 28s | | trunk passed |
   | +1 :green_heart: | mvnsite | 2m 37s | | trunk passed |
   | +1 :green_heart: | javadoc | 2m 6s | | trunk passed with JDK Ubuntu-11.0.23+9-post-Ubuntu-1ubuntu120.04.2 |
   | +1 :green_heart: | javadoc | 1m 44s | | trunk passed with JDK Private Build-1.8.0_412-8u412-ga-1~20.04.1-b08 |
   | +1 :green_heart: | spotbugs | 3m 56s | | trunk passed |
   | +1 :green_heart: | shadedclient | 34m 44s | | branch has no errors when building and testing our client artifacts. |
   | -0 :warning: | patch | 35m 11s | | Used diff version of patch file. Binary files and potentially other changes not applied. Please rebase and squash commits if necessary. |
   |||| _ Patch Compile Tests _ |
   | +0 :ok: | mvndep | 0m 33s | | Maven dependency ordering for patch |
   | +1 :green_heart: | mvninstall | 1m 28s | | the patch passed |
   | +1 :green_heart: | compile | 16m 34s | | the patch passed with JDK Ubuntu-11.0.23+9-post-Ubuntu-1ubuntu120.04.2 |
   | +1 :green_heart: | javac | 16m 34s | | the patch passed |
   | +1 :green_heart: | compile | 16m 15s | | the patch passed with JDK Private Build-1.8.0_412-8u412-ga-1~20.04.1-b08 |
   | +1 :green_heart: | javac | 16m 15s | | the patch passed |
   | -1 :x: | blanks | 0m 0s | [/blanks-eol.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6633/71/artifact/out/blanks-eol.txt) | The patch has 1 line(s) that end in blanks. Use git apply --whitespace=fix <>. Refer https://git-scm.com/docs/git-apply |
   | -0 :warning: | checkstyle | 4m 17s | [/results-checkstyle-root.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6633/71/artifact/out/results-checkstyle-root.txt) | root: The patch generated 2 new + 18 unchanged - 0 fixed = 20 total (was 18) |
   | +1 :green_heart: | mvnsite | 2m 38s | | the patch passed |
   | +1 :green_heart: | javadoc | 2m 2s | | the patch passed with JDK Ubuntu-11.0.23+9-post-Ubuntu-1ubuntu120.04.2 |
   | +1 :green_heart: | javadoc | 1m 43s | | the patch passed with JDK Private Build-1.8.0_412-8u412-ga-1~20.04.1-b08 |
   | +1 :green_heart: | spotbugs | 4m 14s | | the patch passed |
   | +1 :green_heart: | shadedclient | 34m 20s | | patch has no errors when building and testing our client artifacts. |
   |||| _ Other Tests _ |
   | +1 :green_heart: | unit | 20m 8s | | hadoop-common in the patch passed. |
   | +1 :green_heart: | unit | 2m 42s | | hadoop-azure in the patch passed. |
   | +1 :green_heart: | asflicense | 1m 5s | | The patch does not generate ASF License warnings. |
   | | | 243m 4s | | |

   | Subsystem | Report/Notes |
   |----------:|:-------------|
   | Docker | ClientAPI=1.46 ServerAPI=1.46 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6633/71/artifact/out/Dockerfile |
   | GITHUB PR | https://github.com/apache/hadoop/pull/6633 |
   | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets markdownlint |
   | uname | Linux 616bab91c3ed 5.15.0-106-generic #116-Ubuntu SMP Wed Apr 17 09:17:56 UTC 2024 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | dev-support/bin/hado
[jira] [Commented] (HADOOP-19120) [ABFS]: ApacheHttpClient adaptation as network library
[ https://issues.apache.org/jira/browse/HADOOP-19120?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17865009#comment-17865009 ]

ASF GitHub Bot commented on HADOOP-19120:
-----------------------------------------

saxenapranav commented on PR #6633:
URL: https://github.com/apache/hadoop/pull/6633#issuecomment-656550

   Thanks @steveloughran. Makes sense! Have added the documentation in https://hadoop.apache.org/docs/stable/hadoop-azure/abfs.html#Technical_notes. Thank you very much!
[jira] [Commented] (HADOOP-19120) [ABFS]: ApacheHttpClient adaptation as network library
[ https://issues.apache.org/jira/browse/HADOOP-19120?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17864291#comment-17864291 ]

ASF GitHub Bot commented on HADOOP-19120:

steveloughran commented on PR #6633:
URL: https://github.com/apache/hadoop/pull/6633#issuecomment-2218245792

ok, let's see what this test run says.

One thing I've just realised is that there is no public documentation for this, which it needs.

Proposed: somewhere you add a section on this in the public docs
* list the new options
* why httpclient is needed on the classpath
* show how to fall back to jvm

This can be done as a followup: what is key is "people shouldn't need to read the source to see this stuff"

> [ABFS]: ApacheHttpClient adaptation as network library
> ------------------------------------------------------
>
>                 Key: HADOOP-19120
>                 URL: https://issues.apache.org/jira/browse/HADOOP-19120
>             Project: Hadoop Common
>          Issue Type: Sub-task
>          Components: fs/azure
>    Affects Versions: 3.5.0
>            Reporter: Pranav Saxena
>            Assignee: Pranav Saxena
>            Priority: Major
>              Labels: pull-request-available
>             Fix For: 3.5.0, 3.4.1
>
> Apache HttpClient is more feature-rich and flexible and gives applications
> more granular control over networking parameters.
> ABFS currently relies on the JDK-net library. This library is managed by
> OpenJDK and has no performance problem. However, it limits the application's
> control over networking, and there are very few APIs and hooks exposed that
> the application can use to get metrics or to choose which connection should
> be reused and when. ApacheHttpClient will give important hooks to fetch
> important metrics and control networking parameters.
> A custom implementation of connection-pool is used. The implementation is
> adapted from the JDK8 connection pooling. Reasons for doing it:
> 1. The PoolingHttpClientConnectionManager heuristic caches all the reusable
> connections it has created. JDK's implementation only caches a limited number
> of connections. The limit is given by the JVM system property
> "http.maxConnections". If the system property is not set, it defaults to 5.
> Connection-establishment latency increased when all the connections were
> cached. Hence the pooling heuristic of the JDK netlib is adapted.
> 2. PoolingHttpClientConnectionManager expects the application to
> provide `setMaxPerRoute` and `setMaxTotal`, which the implementation uses as
> the total number of connections it can create. For applications using ABFS, it
> is not feasible to provide a value when the connectionManager is initialised.
> JDK's implementation has no cap on the number of connections it can have open
> at a given moment. Hence the pooling heuristic of the JDK netlib is adapted.

--
This message was sent by Atlassian Jira
(v8.20.10#820010)

To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org
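The bounded-cache heuristic described in the issue text (keep only a limited number of idle connections, hand back the most recently used one first, drop entries that have been idle too long) might look roughly like the sketch below. It is not the `KeepAliveCache` from hadoop-azure: the class name, the generic connection type, the explicit `nowMillis` parameter and the choice to reject a connection when the cache is full are all assumptions made for illustration. Only the `http.maxConnections` property and its default of 5 come from the issue description.

```java
import java.util.ArrayDeque;
import java.util.Deque;

/** Hypothetical sketch of a bounded, LIFO keep-alive cache; not the ABFS class. */
final class BoundedKeepAliveCacheSketch<C> {

  /** A cached connection plus the time it was handed back to the cache. */
  private static final class Entry<C> {
    final C conn;
    final long cachedAtMillis;

    Entry(C conn, long cachedAtMillis) {
      this.conn = conn;
      this.cachedAtMillis = cachedAtMillis;
    }
  }

  // Head of the deque is the most recently cached connection, tail the oldest.
  private final Deque<Entry<C>> entries = new ArrayDeque<>();
  private final int maxConn;
  private final long idleTtlMillis;

  BoundedKeepAliveCacheSketch(int maxConn, long idleTtlMillis) {
    this.maxConn = maxConn;
    this.idleTtlMillis = idleTtlMillis;
  }

  /** Reads the JVM property named in the issue, falling back to 5 when unset. */
  static int defaultMaxConn() {
    return Integer.getInteger("http.maxConnections", 5);
  }

  /** Caches the connection; returns false when the cache has no free slot. */
  synchronized boolean put(C conn, long nowMillis) {
    evictExpired(nowMillis);
    if (entries.size() >= maxConn) {
      return false; // assumption: a full (or zero-sized) cache rejects the connection
    }
    entries.push(new Entry<>(conn, nowMillis));
    return true;
  }

  /** Returns the most recently cached live connection, or null if there is none. */
  synchronized C get(long nowMillis) {
    evictExpired(nowMillis);
    Entry<C> entry = entries.poll();
    return entry == null ? null : entry.conn;
  }

  /** Drops entries from the old end of the deque once they exceed the idle TTL. */
  private void evictExpired(long nowMillis) {
    while (!entries.isEmpty()
        && nowMillis - entries.peekLast().cachedAtMillis > idleTtlMillis) {
      entries.removeLast();
    }
  }
}
```

Passing the clock value in explicitly keeps the sketch deterministic; a real cache would more likely read the clock itself and evict from a background timer, as the review comments on the PR suggest. A zero-sized cache in this sketch refuses every `put` and always returns null from `get`, which is the behaviour `testEmptySizePool` in the PR checks for.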
[jira] [Commented] (HADOOP-19120) [ABFS]: ApacheHttpClient adaptation as network library
[ https://issues.apache.org/jira/browse/HADOOP-19120?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17864125#comment-17864125 ]

ASF GitHub Bot commented on HADOOP-19120:

saxenapranav commented on PR #6633:
URL: https://github.com/apache/hadoop/pull/6633#issuecomment-2217283933

> I like this; a lot cleaner. almost ready to go in.
>
> +1 pending a couple of minor comments.

Thank you very much @steveloughran, have taken the comments. Thank you so much!
[jira] [Commented] (HADOOP-19120) [ABFS]: ApacheHttpClient adaptation as network library
[ https://issues.apache.org/jira/browse/HADOOP-19120?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17864123#comment-17864123 ]

ASF GitHub Bot commented on HADOOP-19120:

saxenapranav commented on code in PR #6633:
URL: https://github.com/apache/hadoop/pull/6633#discussion_r1670238068

## hadoop-tools/hadoop-azure/src/test/java/org/apache/hadoop/fs/azurebfs/services/TestApacheClientConnectionPool.java: ##
@@ -76,14 +77,46 @@ public void testPoolWithZeroSysProp() throws Exception {
   @Test
   public void testEmptySizePool() throws Exception {
     Configuration configuration = new Configuration();
-    configuration.set(FS_AZURE_APACHE_HTTP_CLIENT_MAX_CACHE_CONNECTION_SIZE, "0");
-    AbfsConfiguration abfsConfiguration = new AbfsConfiguration(configuration, EMPTY_STRING);
-    try (KeepAliveCache keepAliveCache = new KeepAliveCache(abfsConfiguration)) {
-      Assertions.assertThat(keepAliveCache.put(Mockito.mock(HttpClientConnection.class))).isFalse();
-      Assertions.assertThat(keepAliveCache.get()).isNull();
+    configuration.set(FS_AZURE_APACHE_HTTP_CLIENT_MAX_CACHE_CONNECTION_SIZE,
+        "0");
+    AbfsConfiguration abfsConfiguration = new AbfsConfiguration(configuration,
+        EMPTY_STRING);
+    try (KeepAliveCache keepAliveCache = new KeepAliveCache(
+        abfsConfiguration)) {
+      assertCachePutFail(keepAliveCache,
+          Mockito.mock(HttpClientConnection.class));
+      assertCacheGetNull(keepAliveCache);
     }
   }
 
+  private void assertCacheGetNull(final KeepAliveCache keepAliveCache)

Review Comment:
Makes sense. Have renamed to `assertCacheGetIsNull` and `assertCacheGetIsNonNull`.
[jira] [Commented] (HADOOP-19120) [ABFS]: ApacheHttpClient adaptation as network library
[ https://issues.apache.org/jira/browse/HADOOP-19120?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17864122#comment-17864122 ]

ASF GitHub Bot commented on HADOOP-19120:

saxenapranav commented on code in PR #6633:
URL: https://github.com/apache/hadoop/pull/6633#discussion_r1670237031

## hadoop-tools/hadoop-azure/src/main/java/org/apache/hadoop/fs/azurebfs/services/KeepAliveCache.java: ##
@@ -96,6 +97,8 @@ class KeepAliveCache extends Stack
    */
   private final AtomicBoolean isPaused = new AtomicBoolean(false);
 
+  private final String accountNamePath;

Review Comment:
Makes sense. Have taken it.
[jira] [Commented] (HADOOP-19120) [ABFS]: ApacheHttpClient adaptation as network library
[ https://issues.apache.org/jira/browse/HADOOP-19120?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17864124#comment-17864124 ]

ASF GitHub Bot commented on HADOOP-19120:

saxenapranav commented on code in PR #6633:
URL: https://github.com/apache/hadoop/pull/6633#discussion_r1670238068

## hadoop-tools/hadoop-azure/src/test/java/org/apache/hadoop/fs/azurebfs/services/TestApacheClientConnectionPool.java: ##
@@ -76,14 +77,46 @@ public void testPoolWithZeroSysProp() throws Exception {
   @Test
   public void testEmptySizePool() throws Exception {
     Configuration configuration = new Configuration();
-    configuration.set(FS_AZURE_APACHE_HTTP_CLIENT_MAX_CACHE_CONNECTION_SIZE, "0");
-    AbfsConfiguration abfsConfiguration = new AbfsConfiguration(configuration, EMPTY_STRING);
-    try (KeepAliveCache keepAliveCache = new KeepAliveCache(abfsConfiguration)) {
-      Assertions.assertThat(keepAliveCache.put(Mockito.mock(HttpClientConnection.class))).isFalse();
-      Assertions.assertThat(keepAliveCache.get()).isNull();
+    configuration.set(FS_AZURE_APACHE_HTTP_CLIENT_MAX_CACHE_CONNECTION_SIZE,
+        "0");
+    AbfsConfiguration abfsConfiguration = new AbfsConfiguration(configuration,
+        EMPTY_STRING);
+    try (KeepAliveCache keepAliveCache = new KeepAliveCache(
+        abfsConfiguration)) {
+      assertCachePutFail(keepAliveCache,
+          Mockito.mock(HttpClientConnection.class));
+      assertCacheGetNull(keepAliveCache);
     }
   }
 
+  private void assertCacheGetNull(final KeepAliveCache keepAliveCache)

Review Comment:
Makes sense. Have renamed to `assertCacheGetIsNull` and `assertCacheGetIsNonNull`
[jira] [Commented] (HADOOP-19120) [ABFS]: ApacheHttpClient adaptation as network library
[ https://issues.apache.org/jira/browse/HADOOP-19120?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17864121#comment-17864121 ]

ASF GitHub Bot commented on HADOOP-19120:

saxenapranav commented on code in PR #6633:
URL: https://github.com/apache/hadoop/pull/6633#discussion_r1670237031

## hadoop-tools/hadoop-azure/src/main/java/org/apache/hadoop/fs/azurebfs/services/KeepAliveCache.java: ##
@@ -96,6 +97,8 @@ class KeepAliveCache extends Stack
    */
   private final AtomicBoolean isPaused = new AtomicBoolean(false);
 
+  private final String accountNamePath;

Review Comment:
Makes sense. Have taken it.
[jira] [Commented] (HADOOP-19120) [ABFS]: ApacheHttpClient adaptation as network library
[ https://issues.apache.org/jira/browse/HADOOP-19120?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17864093#comment-17864093 ]

ASF GitHub Bot commented on HADOOP-19120:

hadoop-yetus commented on PR #6633:
URL: https://github.com/apache/hadoop/pull/6633#issuecomment-2217175968

:confetti_ball: **+1 overall**

| Vote | Subsystem | Runtime | Logfile | Comment |
|:----:|----------:|--------:|:-------:|:-------:|
| +0 :ok: | reexec | 11m 57s | | Docker mode activated. |
|||| _ Prechecks _ |
| +1 :green_heart: | dupname | 0m 1s | | No case conflicting files found. |
| +0 :ok: | codespell | 0m 1s | | codespell was not available. |
| +0 :ok: | detsecrets | 0m 1s | | detect-secrets was not available. |
| +1 :green_heart: | @author | 0m 0s | | The patch does not contain any @author tags. |
| +1 :green_heart: | test4tests | 0m 0s | | The patch appears to include 22 new or modified test files. |
|||| _ trunk Compile Tests _ |
| +0 :ok: | mvndep | 14m 40s | | Maven dependency ordering for branch |
| +1 :green_heart: | mvninstall | 32m 50s | | trunk passed |
| +1 :green_heart: | compile | 17m 26s | | trunk passed with JDK Ubuntu-11.0.23+9-post-Ubuntu-1ubuntu120.04.2 |
| +1 :green_heart: | compile | 16m 3s | | trunk passed with JDK Private Build-1.8.0_412-8u412-ga-1~20.04.1-b08 |
| +1 :green_heart: | checkstyle | 4m 27s | | trunk passed |
| +1 :green_heart: | mvnsite | 2m 39s | | trunk passed |
| +1 :green_heart: | javadoc | 2m 6s | | trunk passed with JDK Ubuntu-11.0.23+9-post-Ubuntu-1ubuntu120.04.2 |
| +1 :green_heart: | javadoc | 1m 45s | | trunk passed with JDK Private Build-1.8.0_412-8u412-ga-1~20.04.1-b08 |
| +1 :green_heart: | spotbugs | 3m 52s | | trunk passed |
| +1 :green_heart: | shadedclient | 35m 50s | | branch has no errors when building and testing our client artifacts. |
| -0 :warning: | patch | 36m 17s | | Used diff version of patch file. Binary files and potentially other changes not applied. Please rebase and squash commits if necessary. |
|||| _ Patch Compile Tests _ |
| +0 :ok: | mvndep | 0m 34s | | Maven dependency ordering for patch |
| +1 :green_heart: | mvninstall | 1m 28s | | the patch passed |
| +1 :green_heart: | compile | 16m 49s | | the patch passed with JDK Ubuntu-11.0.23+9-post-Ubuntu-1ubuntu120.04.2 |
| +1 :green_heart: | javac | 16m 49s | | the patch passed |
| +1 :green_heart: | compile | 16m 15s | | the patch passed with JDK Private Build-1.8.0_412-8u412-ga-1~20.04.1-b08 |
| +1 :green_heart: | javac | 16m 15s | | the patch passed |
| +1 :green_heart: | blanks | 0m 0s | | The patch has no blanks issues. |
| -0 :warning: | checkstyle | 4m 18s | [/results-checkstyle-root.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6633/70/artifact/out/results-checkstyle-root.txt) | root: The patch generated 2 new + 18 unchanged - 0 fixed = 20 total (was 18) |
| +1 :green_heart: | mvnsite | 2m 38s | | the patch passed |
| +1 :green_heart: | javadoc | 1m 56s | | the patch passed with JDK Ubuntu-11.0.23+9-post-Ubuntu-1ubuntu120.04.2 |
| +1 :green_heart: | javadoc | 1m 44s | | the patch passed with JDK Private Build-1.8.0_412-8u412-ga-1~20.04.1-b08 |
| +1 :green_heart: | spotbugs | 4m 13s | | the patch passed |
| +1 :green_heart: | shadedclient | 34m 33s | | patch has no errors when building and testing our client artifacts. |
|||| _ Other Tests _ |
| +1 :green_heart: | unit | 19m 33s | | hadoop-common in the patch passed. |
| +1 :green_heart: | unit | 2m 41s | | hadoop-azure in the patch passed. |
| +1 :green_heart: | asflicense | 1m 4s | | The patch does not generate ASF License warnings. |
| | | 255m 46s | | |

| Subsystem | Report/Notes |
|----------:|:-------------|
| Docker | ClientAPI=1.46 ServerAPI=1.46 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6633/70/artifact/out/Dockerfile |
| GITHUB PR | https://github.com/apache/hadoop/pull/6633 |
| Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets |
| uname | Linux ae15abc71c90 5.15.0-106-generic #116-Ubuntu SMP Wed Apr 17 09:17:56 UTC 2024 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | dev-support/bin/hadoop.sh |
| git revision | trunk / fed0a5a5a36946941c61e3d59671b5dbcc69a804 |
| Default Java | Private Build-1.8.0_412-8u412-ga-1~20.04.1-b08 |
| Multi-JDK versions | /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.23+9-post-Ubuntu-1ubuntu120.04.2 /usr/lib/jvm/java-8-openjdk-amd64
[jira] [Commented] (HADOOP-19120) [ABFS]: ApacheHttpClient adaptation as network library
[ https://issues.apache.org/jira/browse/HADOOP-19120?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17863816#comment-17863816 ]

ASF GitHub Bot commented on HADOOP-19120:

steveloughran commented on code in PR #6633:
URL: https://github.com/apache/hadoop/pull/6633#discussion_r1668758494

## hadoop-tools/hadoop-azure/src/main/java/org/apache/hadoop/fs/azurebfs/services/KeepAliveCache.java: ##
@@ -96,6 +97,8 @@ class KeepAliveCache extends Stack
    */
   private final AtomicBoolean isPaused = new AtomicBoolean(false);
 
+  private final String accountNamePath;

Review Comment:
add javadoc to say this is only used for exception messages.

## hadoop-tools/hadoop-azure/src/test/java/org/apache/hadoop/fs/azurebfs/services/TestApacheClientConnectionPool.java: ##
@@ -76,14 +77,46 @@ public void testPoolWithZeroSysProp() throws Exception {
   @Test
   public void testEmptySizePool() throws Exception {
     Configuration configuration = new Configuration();
-    configuration.set(FS_AZURE_APACHE_HTTP_CLIENT_MAX_CACHE_CONNECTION_SIZE, "0");
-    AbfsConfiguration abfsConfiguration = new AbfsConfiguration(configuration, EMPTY_STRING);
-    try (KeepAliveCache keepAliveCache = new KeepAliveCache(abfsConfiguration)) {
-      Assertions.assertThat(keepAliveCache.put(Mockito.mock(HttpClientConnection.class))).isFalse();
-      Assertions.assertThat(keepAliveCache.get()).isNull();
+    configuration.set(FS_AZURE_APACHE_HTTP_CLIENT_MAX_CACHE_CONNECTION_SIZE,
+        "0");
+    AbfsConfiguration abfsConfiguration = new AbfsConfiguration(configuration,
+        EMPTY_STRING);
+    try (KeepAliveCache keepAliveCache = new KeepAliveCache(
+        abfsConfiguration)) {
+      assertCachePutFail(keepAliveCache,
+          Mockito.mock(HttpClientConnection.class));
+      assertCacheGetNull(keepAliveCache);
     }
   }
 
+  private void assertCacheGetNull(final KeepAliveCache keepAliveCache)

Review Comment:
can you use isNull and isNotNull to make clearer when reading what is being asserted?
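The naming point in the review comment above (helpers whose names say what is being asserted) can be illustrated with a small stand-alone sketch. This is not the code from the PR: the real test works against `KeepAliveCache` and uses AssertJ, whereas the class below is hypothetical and takes a plain `Supplier` so that it compiles with the JDK alone. Only the two helper names are taken from the follow-up replies in this thread.

```java
import java.util.function.Supplier;

/** Hypothetical helpers; the real test asserts on KeepAliveCache via AssertJ. */
final class CacheAssertionSketch {

  private CacheAssertionSketch() {
  }

  /** Fails unless the cache lookup yields null, i.e. nothing is cached. */
  static void assertCacheGetIsNull(Supplier<?> cacheGet) {
    Object conn = cacheGet.get();
    if (conn != null) {
      throw new AssertionError(
          "expected cache get() to be null but was " + conn);
    }
  }

  /** Fails unless the cache lookup yields a connection. */
  static void assertCacheGetIsNonNull(Supplier<?> cacheGet) {
    if (cacheGet.get() == null) {
      throw new AssertionError("expected cache get() to be non-null");
    }
  }
}
```

With names like these a call site such as `assertCacheGetIsNull(cache::get)` reads as a statement of the expected outcome, so the reader does not have to open the helper to learn which way the check goes.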
[jira] [Commented] (HADOOP-19120) [ABFS]: ApacheHttpClient adaptation as network library
[ https://issues.apache.org/jira/browse/HADOOP-19120?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17863812#comment-17863812 ]

ASF GitHub Bot commented on HADOOP-19120:

steveloughran commented on PR #6633:
URL: https://github.com/apache/hadoop/pull/6633#issuecomment-2214233797

I was on vacation for a week. It was the July 4 week, and it's always quiet.
[jira] [Commented] (HADOOP-19120) [ABFS]: ApacheHttpClient adaptation as network library
[ https://issues.apache.org/jira/browse/HADOOP-19120?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17863656#comment-17863656 ]

ASF GitHub Bot commented on HADOOP-19120:

saxenapranav commented on PR #6633:
URL: https://github.com/apache/hadoop/pull/6633#issuecomment-2212930442

Hi @steveloughran, thank you so much for the review. Have taken the comments. Requesting your kind review on the changes please. Thank you very much!
[jira] [Commented] (HADOOP-19120) [ABFS]: ApacheHttpClient adaptation as network library
[ https://issues.apache.org/jira/browse/HADOOP-19120?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17855128#comment-17855128 ]

ASF GitHub Bot commented on HADOOP-19120:
-----------------------------------------

steveloughran commented on code in PR #6633:
URL: https://github.com/apache/hadoop/pull/6633#discussion_r1640229114

##########
hadoop-tools/hadoop-azure/src/main/java/org/apache/hadoop/fs/azurebfs/services/KeepAliveCache.java:
##########
@@ -48,296 +50,214 @@
  * number of connections it can create.
  *
  */
-public final class KeepAliveCache
-    extends HashMap
-    implements Runnable {
+public final class KeepAliveCache extends Stack
+    implements
+    Closeable {
 
-  private int maxConn;
+  /**
+   * Scheduled timer that evicts idle connections.
+   */
+  private final Timer timer;
 
-  private long connectionIdleTTL = KAC_DEFAULT_CONN_TTL;
+  /**
+   * Task provided to the timer that owns eviction logic.
+   */
+  private final TimerTask timerTask;
 
-  private Thread keepAliveTimer = null;
+  /**
+   * Flag to indicate if the cache is closed.
+   */
+  private boolean isClosed;
 
-  private boolean isPaused = false;
+  /**
+   * Counter to keep track of the number of KeepAliveCache instances created.
+   */
+  private static final AtomicInteger KAC_COUNTER = new AtomicInteger(0);
 
-  private KeepAliveCache() {
-    setMaxConn();
-  }
+  /**
+   * Maximum number of connections that can be cached.
+   */
+  private final int maxConn;
+
+  /**
+   * Time-to-live for an idle connection.
+   */
+  private final long connectionIdleTTL;
+
+  /**
+   * Flag to indicate if the eviction thread is paused.
+   */
+  private boolean isPaused = false;
 
+  @VisibleForTesting
   synchronized void pauseThread() {
     isPaused = true;
   }
 
+  @VisibleForTesting
   synchronized void resumeThread() {
     isPaused = false;
-    notify();
   }
 
-  private void setMaxConn() {
+  /**
+   * @return connectionIdleTTL

Review Comment:
   nit: add a . to keep javadoc happy

+   */
+  @VisibleForTesting
+  public long getConnectionIdleTTL() {
+    return connectionIdleTTL;
+  }
+
+  public KeepAliveCache(AbfsConfiguration abfsConfiguration) {

Review Comment:
   javadoc
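The reviewed revision documents a timer task that "owns eviction logic" and a time-to-live for idle connections. The eviction rule itself can be sketched independently of any timer by passing the current time in as a parameter, which also makes it easy to test. This is a hedged illustration only: `IdleEvictionSketch`, `Entry`, `put`, and `evictExpired` are hypothetical names, and connections are stood in for by string labels.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.Iterator;
import java.util.List;

/** Hypothetical sketch of TTL-based eviction for an idle-connection cache. */
final class IdleEvictionSketch {

  /** An idle entry: a connection label plus the time it became idle. */
  private static final class Entry {
    final String name;
    final long idleSinceMillis;

    Entry(String name, long idleSinceMillis) {
      this.name = name;
      this.idleSinceMillis = idleSinceMillis;
    }
  }

  private final long ttlMillis;
  private final Deque<Entry> entries = new ArrayDeque<>();

  IdleEvictionSketch(long ttlMillis) {
    this.ttlMillis = ttlMillis;
  }

  /** Records that a connection became idle at nowMillis. */
  synchronized void put(String name, long nowMillis) {
    entries.addFirst(new Entry(name, nowMillis));
  }

  /** Removes and returns every entry that has been idle for longer than the TTL. */
  synchronized List<String> evictExpired(long nowMillis) {
    List<String> evicted = new ArrayList<>();
    Iterator<Entry> it = entries.iterator();
    while (it.hasNext()) {
      Entry e = it.next();
      if (nowMillis - e.idleSinceMillis > ttlMillis) {
        it.remove();
        evicted.add(e.name);
      }
    }
    return evicted;
  }

  synchronized int size() {
    return entries.size();
  }
}
```

In a timer-driven design, the scheduled task would call something like `evictExpired(System.currentTimeMillis())` and close whatever it returns.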
[jira] [Commented] (HADOOP-19120) [ABFS]: ApacheHttpClient adaptation as network library
[ https://issues.apache.org/jira/browse/HADOOP-19120?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17854394#comment-17854394 ]

ASF GitHub Bot commented on HADOOP-19120:
-----------------------------------------

saxenapranav commented on PR #6633:
URL: https://github.com/apache/hadoop/pull/6633#issuecomment-2162706178

> Overall, I like the design, especially the way that it is possible to switch back to the existing connections and in future move to the java11+ API.
>
> However, there is an absolute -1, KeepAliveCache contains code copied from the JDK. You get to delete all code which is duplicate and review it very carefully to make sure this is the case.
>
> I am not happy with us creating a long lived thread that never goes away. Yes, it is done for the JVM, but that's a long-standing design decision of theirs. Instead, make this one per abfs instance. When the filesystem is closed this cache can/should be shut down. Or is there something I am missing here - such as how it integrates with the JDK?
>
> As usual, I've complained about javadocs a lot. This is for the people that come to maintain it in the future, yourself included - and for IDE popups.

Thank you @steveloughran for the review. I really appreciate your time on this. I have now differentiated the KeepAliveCache code from the JDK's implementation. Each filesystem now has its own instance of KeepAliveCache, which maintains connection pooling for that filesystem. The cache has a timer that runs for the lifetime of the filesystem. When the filesystem is closed, the cache is closed, and the timer is closed with it.

I have taken the comments and request your kind review, please. Thank you again for your time and insights.

Thanks!
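The lifecycle described in this reply (one cache per filesystem, with a timer that stops when the filesystem is closed) can be sketched as follows. The class names `TimedCacheSketch` and `FileSystemSketch`, the 50 ms period, and the sweep counter are all assumptions made for illustration; only `java.util.Timer` and `java.util.TimerTask` are real APIs, and the actual PR code may be structured differently.

```java
import java.io.Closeable;
import java.util.Timer;
import java.util.TimerTask;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical cache that owns a timer and cancels it on close. */
final class TimedCacheSketch implements Closeable {
  private final Timer timer;
  private final AtomicInteger sweeps = new AtomicInteger();
  private volatile boolean closed;

  TimedCacheSketch(long periodMillis) {
    // Daemon timer thread, so it cannot keep the JVM alive on its own.
    this.timer = new Timer("sketch-kac-evictor", true);
    this.timer.schedule(new TimerTask() {
      @Override
      public void run() {
        // Placeholder for the eviction pass over idle connections.
        sweeps.incrementAndGet();
      }
    }, periodMillis, periodMillis);
  }

  boolean isClosed() {
    return closed;
  }

  int sweepCount() {
    return sweeps.get();
  }

  @Override
  public void close() {
    closed = true;
    // Timer.cancel() may be called repeatedly; later calls have no effect.
    timer.cancel();
  }
}

/** Hypothetical stand-in for a filesystem instance that owns one cache. */
final class FileSystemSketch implements Closeable {
  private final TimedCacheSketch cache = new TimedCacheSketch(50L);

  TimedCacheSketch cache() {
    return cache;
  }

  @Override
  public void close() {
    // Closing the filesystem shuts the cache down, which cancels its timer.
    cache.close();
  }
}
```

Tying the timer to the owner's `close()` avoids the long-lived, never-terminating thread that the review objected to.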
[jira] [Commented] (HADOOP-19120) [ABFS]: ApacheHttpClient adaptation as network library
[ https://issues.apache.org/jira/browse/HADOOP-19120?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17854392#comment-17854392 ]

ASF GitHub Bot commented on HADOOP-19120:
-----------------------------------------

saxenapranav commented on code in PR #6633:
URL: https://github.com/apache/hadoop/pull/6633#discussion_r1636235723

##########
hadoop-tools/hadoop-azure/src/main/java/org/apache/hadoop/fs/azurebfs/services/KeepAliveCache.java:
##########
@@ -0,0 +1,362 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements. See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership. The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.hadoop.fs.azurebfs.services;
+
+import java.io.IOException;
+import java.io.NotSerializableException;
+import java.util.ArrayList;
+import java.util.HashMap;
+import java.util.Map;
+
+import org.apache.hadoop.classification.VisibleForTesting;
+import org.apache.hadoop.fs.azurebfs.AbfsConfiguration;
+import org.apache.http.HttpClientConnection;
+import org.apache.http.conn.routing.HttpRoute;
+
+import static org.apache.hadoop.fs.azurebfs.constants.AbfsHttpConstants.DEFAULT_MAX_CONN_SYS_PROP;
+import static org.apache.hadoop.fs.azurebfs.constants.AbfsHttpConstants.HTTP_MAX_CONN_SYS_PROP;
+import static org.apache.hadoop.fs.azurebfs.constants.AbfsHttpConstants.KAC_DEFAULT_CONN_TTL;
+
+/**
+ * Connection-pooling heuristics adapted from JDK's connection pooling `KeepAliveCache`
+ *
+ * Why this implementation is required in comparison to {@link org.apache.http.impl.conn.PoolingHttpClientConnectionManager}
+ * connection-pooling:
+ *
+ * PoolingHttpClientConnectionManager heuristic caches all the reusable connections it has created.
+ * JDK's implementation only caches limited number of connections. The limit is given by JVM system
+ * property "http.maxConnections". If there is no system-property, it defaults to 5.
+ * In PoolingHttpClientConnectionManager, it expects the application to provide `setMaxPerRoute` and `setMaxTotal`,
+ * which the implementation uses as the total number of connections it can create. For application using ABFS, it is not
+ * feasible to provide a value in the initialisation of the connectionManager. JDK's implementation has no cap on the
+ * number of connections it can create.
+ *
+ */
+public final class KeepAliveCache
+    extends HashMap
+    implements Runnable {
+
+  private int maxConn;
+
+  private long connectionIdleTTL = KAC_DEFAULT_CONN_TTL;
+
+  private Thread keepAliveTimer = null;
+
+  private boolean isPaused = false;
+
+  private KeepAliveCache() {
+    setMaxConn();
+  }
+
+  synchronized void pauseThread() {
+    isPaused = true;
+  }
+
+  synchronized void resumeThread() {
+    isPaused = false;
+    notify();
+  }
+
+  private void setMaxConn() {
+    String sysPropMaxConn = System.getProperty(HTTP_MAX_CONN_SYS_PROP);
+    if (sysPropMaxConn == null) {
+      maxConn = DEFAULT_MAX_CONN_SYS_PROP;
+    } else {
+      maxConn = Integer.parseInt(sysPropMaxConn);
+    }
+  }
+
+  public void setAbfsConfig(AbfsConfiguration abfsConfiguration) {
+    this.maxConn = abfsConfiguration.getMaxApacheHttpClientCacheConnections();
+    this.connectionIdleTTL = abfsConfiguration.getMaxApacheHttpClientConnectionIdleTime();
+  }
+
+  public long getConnectionIdleTTL() {
+    return connectionIdleTTL;
+  }
+
+  private static final KeepAliveCache INSTANCE = new KeepAliveCache();
+
+  public static KeepAliveCache getInstance() {
+    return INSTANCE;
+  }
+
+  @VisibleForTesting
+  void clearThread() {
+    clear();
+    setMaxConn();
+  }
+
+  private int getKacSize() {
+    return INSTANCE.maxConn;
+  }
+
+  @Override
+  public void run() {
+    do {
+      synchronized (this) {
+        while (isPaused) {
+          try {
+            wait();
+          } catch (InterruptedException ignored) {
+          }
+        }
+      }
+      kacCleanup();
+    } while (size() > 0);
+  }
+
+  private void kacCleanup() {
+    try {
+      Thread.sleep(connectionIdleTTL);
+    } catch (InterruptedException ex) {
+      return;
+    }
+    synchronized (this) {
+      long currentTime = System.currentTimeMillis();
+
+      ArrayList keysToRemove
+          = new ArrayList();
+
+      for (Map.Entry entry : entrySet()) {
+        KeepAli
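The `pauseThread`/`resumeThread` pair in this first revision relies on the standard monitor pattern: the worker waits in a loop while a flag is set, and the resumer clears the flag and notifies. A minimal, self-contained sketch of that pattern follows. `PauseGate` and its method names are hypothetical, and unlike the quoted `run()` method, which ignores `InterruptedException`, this sketch restores the interrupt flag and returns; that is a deliberate design difference, not a description of the PR.

```java
/** Hypothetical sketch of a pause/resume gate built on wait/notifyAll. */
final class PauseGate {
  private boolean paused;

  synchronized void pause() {
    paused = true;
  }

  synchronized void resume() {
    paused = false;
    // Wake every waiter; each one re-checks the flag in its loop.
    notifyAll();
  }

  /**
   * Blocks while the gate is paused.
   *
   * @return true if the gate is open, false if the thread was interrupted
   *         while waiting (the interrupt flag is restored for the caller).
   */
  synchronized boolean awaitIfPaused() {
    while (paused) {
      try {
        wait();
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
        return false;
      }
    }
    return true;
  }
}
```

A cleanup loop would call `awaitIfPaused()` before each pass and stop when it returns false, so an interrupt can end the thread instead of being silently dropped.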
[jira] [Commented] (HADOOP-19120) [ABFS]: ApacheHttpClient adaptation as network library
[ https://issues.apache.org/jira/browse/HADOOP-19120?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17854391#comment-17854391 ]

ASF GitHub Bot commented on HADOOP-19120:
-----------------------------------------

saxenapranav commented on code in PR #6633:
URL: https://github.com/apache/hadoop/pull/6633#discussion_r1636235267

##########
hadoop-tools/hadoop-azure/src/main/java/org/apache/hadoop/fs/azurebfs/services/KeepAliveCache.java:
##########
@@ -0,0 +1,362 @@
[...]
+ * feasible to provide a value in the initialisation of the connectionManager. JDK's implementation has no cap on the
+ * number of connections it can create.
+ *
+ */
+public final class KeepAliveCache

Review Comment:
   taken.
[jira] [Commented] (HADOOP-19120) [ABFS]: ApacheHttpClient adaptation as network library
[ https://issues.apache.org/jira/browse/HADOOP-19120?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17854390#comment-17854390 ]

ASF GitHub Bot commented on HADOOP-19120:
-----------------------------------------

saxenapranav commented on code in PR #6633:
URL: https://github.com/apache/hadoop/pull/6633#discussion_r1636234636

##########
hadoop-tools/hadoop-azure/src/main/java/org/apache/hadoop/fs/azurebfs/services/KeepAliveCache.java:
##########
@@ -0,0 +1,362 @@
[...]
+import static org.apache.hadoop.fs.azurebfs.constants.AbfsHttpConstants.KAC_DEFAULT_CONN_TTL;
+
+/**
+ * Connection-pooling heuristics adapted from JDK's connection pooling `KeepAliveCache`

Review Comment:
   Have now differentiated from the JDK's implementation.
[jira] [Commented] (HADOOP-19120) [ABFS]: ApacheHttpClient adaptation as network library
[ https://issues.apache.org/jira/browse/HADOOP-19120?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17854389#comment-17854389 ]

ASF GitHub Bot commented on HADOOP-19120:
-----------------------------------------

saxenapranav commented on code in PR #6633:
URL: https://github.com/apache/hadoop/pull/6633#discussion_r1636234165

##########
hadoop-tools/hadoop-azure/src/main/java/org/apache/hadoop/fs/azurebfs/services/AbfsRestOperation.java:
##########
@@ -95,11 +98,15 @@ public class AbfsRestOperation {
   private String failureReason;
   private AbfsRetryPolicy retryPolicy;
 
+  private final AbfsConfiguration abfsConfiguration;
+
   /**
    * This variable stores the tracing context used for last Rest Operation.
    */
   private TracingContext lastUsedTracingContext;
 
+  private int apacheHttpClientIoExceptions = 0;

Review Comment:
   taken.
[jira] [Commented] (HADOOP-19120) [ABFS]: ApacheHttpClient adaptation as network library
[ https://issues.apache.org/jira/browse/HADOOP-19120?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17854387#comment-17854387 ]

ASF GitHub Bot commented on HADOOP-19120:
-----------------------------------------

saxenapranav commented on code in PR #6633:
URL: https://github.com/apache/hadoop/pull/6633#discussion_r1636232942

##########
hadoop-tools/hadoop-azure/src/main/java/org/apache/hadoop/fs/azurebfs/services/AbfsJdkHttpOperation.java:
##########
@@ -0,0 +1,345 @@
[...]
+package org.apache.hadoop.fs.azurebfs.services;
+
+import java.io.IOException;
+import java.io.InputStream;
+import java.io.OutputStream;
+import java.net.HttpURLConnection;
+import java.net.ProtocolException;
+import java.net.URL;
+import java.util.List;
+import java.util.Map;
+
+import javax.net.ssl.HttpsURLConnection;
+import javax.net.ssl.SSLSocketFactory;
+
+import org.apache.hadoop.classification.VisibleForTesting;
+import org.apache.hadoop.security.ssl.DelegatingSSLSocketFactory;
+
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import org.apache.hadoop.fs.azurebfs.constants.AbfsHttpConstants;
+import org.apache.hadoop.fs.azurebfs.constants.HttpHeaderConfigurations;
+
+import static org.apache.hadoop.fs.azurebfs.constants.AbfsHttpConstants.EXPECT_100_JDK_ERROR;
+import static org.apache.hadoop.fs.azurebfs.constants.AbfsHttpConstants.HUNDRED_CONTINUE;
+import static org.apache.hadoop.fs.azurebfs.constants.AbfsHttpConstants.JDK_FALLBACK;
+import static org.apache.hadoop.fs.azurebfs.constants.AbfsHttpConstants.JDK_IMPL;
+import static org.apache.hadoop.fs.azurebfs.constants.HttpHeaderConfigurations.EXPECT;
+
+/**
+ * Implementation of {@link AbfsHttpOperation} for orchestrating calls using JDK's HttpURLConnection.
+ */
+public class AbfsJdkHttpOperation extends AbfsHttpOperation {
+
+  private static final Logger LOG = LoggerFactory.getLogger(
+      AbfsJdkHttpOperation.class);
+
+  private HttpURLConnection connection;

Review Comment:
   taken.
[jira] [Commented] (HADOOP-19120) [ABFS]: ApacheHttpClient adaptation as network library
[ https://issues.apache.org/jira/browse/HADOOP-19120?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17854388#comment-17854388 ] ASF GitHub Bot commented on HADOOP-19120: - saxenapranav commented on code in PR #6633: URL: https://github.com/apache/hadoop/pull/6633#discussion_r1636233320 ## hadoop-tools/hadoop-azure/src/main/java/org/apache/hadoop/fs/azurebfs/services/AbfsManagedHttpClientContext.java: ## @@ -0,0 +1,70 @@ +/** + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.hadoop.fs.azurebfs.services; + +import org.apache.http.HttpClientConnection; +import org.apache.http.client.protocol.HttpClientContext; + +public class AbfsManagedHttpClientContext extends HttpClientContext { Review Comment: taken. ## hadoop-tools/hadoop-azure/src/main/java/org/apache/hadoop/fs/azurebfs/services/AbfsNoOpThrottlingIntercept.java: ## @@ -18,6 +18,8 @@ package org.apache.hadoop.fs.azurebfs.services; +import org.apache.hadoop.fs.azurebfs.constants.AbfsRestOperationType; + final class AbfsNoOpThrottlingIntercept implements AbfsThrottlingIntercept { Review Comment: taken. 
[jira] [Commented] (HADOOP-19120) [ABFS]: ApacheHttpClient adaptation as network library
[ https://issues.apache.org/jira/browse/HADOOP-19120?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17854386#comment-17854386 ]

ASF GitHub Bot commented on HADOOP-19120:

saxenapranav commented on code in PR #6633:
URL: https://github.com/apache/hadoop/pull/6633#discussion_r1636232649

## hadoop-tools/hadoop-azure/src/main/java/org/apache/hadoop/fs/azurebfs/services/AbfsHttpOperation.java:
##
@@ -254,158 +311,19 @@ public String getMaskedEncodedUrl() {
     return maskedEncodedUrl;
   }
 
-  /**
-   * Initializes a new HTTP request and opens the connection.
-   *
-   * @param url The full URL including query string parameters.
-   * @param method The HTTP method (PUT, PATCH, POST, GET, HEAD, or DELETE).
-   * @param requestHeaders The HTTP request headers.READ_TIMEOUT
-   * @param connectionTimeout The Connection Timeout value to be used while establishing http connection
-   * @param readTimeout The Read Timeout value to be used with http connection while making a request
-   * @throws IOException if an error occurs.
-   */
-  public AbfsHttpOperation(final URL url, final String method, final List requestHeaders,

Review Comment:
   taken.
[jira] [Commented] (HADOOP-19120) [ABFS]: ApacheHttpClient adaptation as network library
[ https://issues.apache.org/jira/browse/HADOOP-19120?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17854384#comment-17854384 ]

ASF GitHub Bot commented on HADOOP-19120:

saxenapranav commented on code in PR #6633:
URL: https://github.com/apache/hadoop/pull/6633#discussion_r1636228000

## hadoop-tools/hadoop-azure/src/main/java/org/apache/hadoop/fs/azurebfs/services/AbfsHttpOperation.java:
##
@@ -103,13 +111,31 @@ public static AbfsHttpOperation getAbfsHttpOperationWithFixedResult(
   protected AbfsHttpOperation(final URL url, final String method, final int httpStatus) {
+    this.log = null;

Review Comment:
   Removed any flow which can put null here.
[jira] [Commented] (HADOOP-19120) [ABFS]: ApacheHttpClient adaptation as network library
[ https://issues.apache.org/jira/browse/HADOOP-19120?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17854383#comment-17854383 ]

ASF GitHub Bot commented on HADOOP-19120:

saxenapranav commented on code in PR #6633:
URL: https://github.com/apache/hadoop/pull/6633#discussion_r1636228000

## hadoop-tools/hadoop-azure/src/main/java/org/apache/hadoop/fs/azurebfs/services/AbfsHttpOperation.java:
##
@@ -103,13 +111,31 @@ public static AbfsHttpOperation getAbfsHttpOperationWithFixedResult(
   protected AbfsHttpOperation(final URL url, final String method, final int httpStatus) {
+    this.log = null;

Review Comment:
   Remove any flow which can put null here.
[jira] [Commented] (HADOOP-19120) [ABFS]: ApacheHttpClient adaptation as network library
[ https://issues.apache.org/jira/browse/HADOOP-19120?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17854381#comment-17854381 ]

ASF GitHub Bot commented on HADOOP-19120:

saxenapranav commented on code in PR #6633:
URL: https://github.com/apache/hadoop/pull/6633#discussion_r1636226797

## hadoop-tools/hadoop-azure/src/main/java/org/apache/hadoop/fs/azurebfs/services/AbfsConnectionManager.java:
##
@@ -0,0 +1,155 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements. See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership. The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.hadoop.fs.azurebfs.services;
+
+import java.io.IOException;
+import java.util.concurrent.ExecutionException;
+import java.util.concurrent.TimeUnit;
+
+import org.apache.http.HttpClientConnection;
+import org.apache.http.config.Registry;
+import org.apache.http.config.SocketConfig;
+import org.apache.http.conn.ConnectionPoolTimeoutException;
+import org.apache.http.conn.ConnectionRequest;
+import org.apache.http.conn.HttpClientConnectionManager;
+import org.apache.http.conn.HttpClientConnectionOperator;
+import org.apache.http.conn.routing.HttpRoute;
+import org.apache.http.conn.socket.ConnectionSocketFactory;
+import org.apache.http.impl.conn.DefaultHttpClientConnectionOperator;
+import org.apache.http.impl.conn.ManagedHttpClientConnectionFactory;
+import org.apache.http.protocol.HttpContext;
+
+/**
+ * AbfsConnectionManager is a custom implementation of {@link HttpClientConnectionManager}.
+ * This implementation manages connection-pooling heuristics and custom implementation
+ * of {@link ManagedHttpClientConnectionFactory}.
+ */
+class AbfsConnectionManager implements HttpClientConnectionManager {
+
+  private final KeepAliveCache kac = KeepAliveCache.getInstance();
+
+  private final AbfsConnFactory httpConnectionFactory;
+
+  private final HttpClientConnectionOperator connectionOperator;
+
+  AbfsConnectionManager(Registry socketFactoryRegistry,
+      AbfsConnFactory connectionFactory) {
+    this.httpConnectionFactory = connectionFactory;
+    connectionOperator = new DefaultHttpClientConnectionOperator(
+        socketFactoryRegistry, null, null);
+  }
+
+  @Override
+  public ConnectionRequest requestConnection(final HttpRoute route,
+      final Object state) {
+    return new ConnectionRequest() {
+      @Override
+      public HttpClientConnection get(final long timeout,
+          final TimeUnit timeUnit)
+          throws InterruptedException, ExecutionException,
+          ConnectionPoolTimeoutException {
+        try {
+          HttpClientConnection clientConn = kac.get(route);
+          if (clientConn != null) {
+            return clientConn;
+          }
+          return httpConnectionFactory.create(route, null);
+        } catch (IOException ex) {
+          throw new ExecutionException(ex);
+        }
+      }
+
+      @Override
+      public boolean cancel() {
+        return false;
+      }
+    };
+  }
+
+  /**
+   * Releases a connection for reuse. It can be reused only if validDuration is greater than 0.
+   * This method is called by {@link org.apache.http.impl.execchain} internal class `ConnectionHolder`.
+   * If it wants to reuse the connection, it will send a non-zero validDuration, else it will send 0.
+   * @param conn the connection to release
+   * @param newState the new state of the connection
+   * @param validDuration the duration for which the connection is valid
+   * @param timeUnit the time unit for the validDuration
+   */
+  @Override
+  public void releaseConnection(final HttpClientConnection conn,
+      final Object newState,
+      final long validDuration,
+      final TimeUnit timeUnit) {
+    if (validDuration == 0) {
+      return;
+    }
+    if (conn.isOpen() && conn instanceof AbfsManagedApacheHttpConnection) {
+      HttpRoute route = ((AbfsManagedApacheHttpConnection) conn).getHttpRoute();
+      if (route != null) {
+        kac.put(route, conn);
+      }
+    }
+  }
+
+  @Override
+  public void connect(final HttpClientConnection conn,
+      final HttpRoute route,
+      final int connectTimeout,
+      final HttpContext context) throws IOException {
+    long start = System.currentTimeMill
[jira] [Commented] (HADOOP-19120) [ABFS]: ApacheHttpClient adaptation as network library
[ https://issues.apache.org/jira/browse/HADOOP-19120?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17854377#comment-17854377 ]

ASF GitHub Bot commented on HADOOP-19120:

saxenapranav commented on code in PR #6633:
URL: https://github.com/apache/hadoop/pull/6633#discussion_r1636216799

## hadoop-tools/hadoop-azure/src/main/java/org/apache/hadoop/fs/azurebfs/services/AbfsHttpOperation.java:
##
@@ -20,57 +20,51 @@
 import java.io.IOException;
 import java.io.InputStream;
-import java.io.OutputStream;
 import java.net.HttpURLConnection;
-import java.net.ProtocolException;
 import java.net.URL;
+import java.util.ArrayList;
 import java.util.List;
-
-import javax.net.ssl.HttpsURLConnection;
-import javax.net.ssl.SSLSocketFactory;
-
-import org.apache.hadoop.classification.VisibleForTesting;
-import org.apache.hadoop.fs.azurebfs.utils.UriUtils;
-import org.apache.hadoop.security.ssl.DelegatingSSLSocketFactory;
+import java.util.Map;
 
 import com.fasterxml.jackson.core.JsonFactory;
 import com.fasterxml.jackson.core.JsonParser;
 import com.fasterxml.jackson.core.JsonToken;
 import com.fasterxml.jackson.databind.ObjectMapper;
 import org.slf4j.Logger;
-import org.slf4j.LoggerFactory;
 
 import org.apache.hadoop.fs.azurebfs.constants.AbfsHttpConstants;
 import org.apache.hadoop.fs.azurebfs.constants.HttpHeaderConfigurations;
 import org.apache.hadoop.fs.azurebfs.contracts.services.AbfsPerfLoggable;
 import org.apache.hadoop.fs.azurebfs.contracts.services.ListResultSchema;
-
-import static org.apache.hadoop.fs.azurebfs.constants.AbfsHttpConstants.EXPECT_100_JDK_ERROR;
-import static org.apache.hadoop.fs.azurebfs.constants.AbfsHttpConstants.HUNDRED_CONTINUE;
-import static org.apache.hadoop.fs.azurebfs.constants.HttpHeaderConfigurations.EXPECT;
+import org.apache.hadoop.fs.azurebfs.utils.UriUtils;
 
 /**
- * Represents an HTTP operation.
+ * Base Http operation class for orchestrating server IO calls. Child classes would
+ * define the certain orchestration implementation on the basis of network library used.
+ *
+ * For JDK netlib usage, the child class would be {@link AbfsJdkHttpOperation}.
+ * For ApacheHttpClient netlib usage, the child class would be {@link AbfsAHCHttpOperation}.
+ *
  */
-public class AbfsHttpOperation implements AbfsPerfLoggable {
-  private static final Logger LOG = LoggerFactory.getLogger(AbfsHttpOperation.class);
+public abstract class AbfsHttpOperation implements AbfsPerfLoggable {

Review Comment:
   Added.
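The Javadoc in the diff above describes an abstract base operation with one child class per network library. A minimal sketch of that shape, assuming nothing about the real AbfsHttpOperation API, might look like this. NetLib, HttpOperationSketch, JdkOperationSketch, ApacheOperationSketch and every method on them are hypothetical names for illustration, and sendRequest is a stub that performs no network I/O.

```java
/** Hypothetical switch between network libraries; not an ABFS type. */
enum NetLib { JDK, APACHE }

/**
 * Illustrative base class: shared bookkeeping lives here, while the
 * library-specific request execution is left to the child classes.
 */
abstract class HttpOperationSketch {
  private final String method;
  private final String url;
  private long elapsedNanos;

  protected HttpOperationSketch(String method, String url) {
    this.method = method;
    this.url = url;
  }

  String getMethod() { return method; }

  String getUrl() { return url; }

  long getElapsedNanos() { return elapsedNanos; }

  /** Template method: times the call and delegates the actual I/O. */
  final int execute() {
    long start = System.nanoTime();
    try {
      return sendRequest();
    } finally {
      elapsedNanos = System.nanoTime() - start;
    }
  }

  /** Library-specific request execution; returns an HTTP status code. */
  protected abstract int sendRequest();

  abstract String libraryName();

  /** Picks the child class for the configured network library. */
  static HttpOperationSketch create(NetLib lib, String method, String url) {
    switch (lib) {
      case APACHE:
        return new ApacheOperationSketch(method, url);
      case JDK:
      default:
        return new JdkOperationSketch(method, url);
    }
  }
}

final class JdkOperationSketch extends HttpOperationSketch {
  JdkOperationSketch(String method, String url) { super(method, url); }

  @Override
  protected int sendRequest() { return 200; } // stub: no real request is sent

  @Override
  String libraryName() { return "jdk"; }
}

final class ApacheOperationSketch extends HttpOperationSketch {
  ApacheOperationSketch(String method, String url) { super(method, url); }

  @Override
  protected int sendRequest() { return 200; } // stub: no real request is sent

  @Override
  String libraryName() { return "apache"; }
}
```

Callers only see the base type, so switching the library becomes a configuration decision made once in the factory method rather than at every call site.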
[jira] [Commented] (HADOOP-19120) [ABFS]: ApacheHttpClient adaptation as network library
[ https://issues.apache.org/jira/browse/HADOOP-19120?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17854376#comment-17854376 ]

ASF GitHub Bot commented on HADOOP-19120:

saxenapranav commented on code in PR #6633:
URL: https://github.com/apache/hadoop/pull/6633#discussion_r1636216485

## hadoop-tools/hadoop-azure/src/main/java/org/apache/hadoop/fs/azurebfs/services/AbfsConnFactory.java:
##
@@ -0,0 +1,38 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements. See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership. The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.hadoop.fs.azurebfs.services;
+
+import org.apache.http.config.ConnectionConfig;
+import org.apache.http.conn.ManagedHttpClientConnection;
+import org.apache.http.conn.routing.HttpRoute;
+import org.apache.http.impl.conn.ManagedHttpClientConnectionFactory;
+
+/**
+ * Custom implementation of {@link ManagedHttpClientConnectionFactory} and overrides
+ * {@link ManagedHttpClientConnectionFactory#create(HttpRoute, ConnectionConfig)} to return
+ * {@link AbfsManagedApacheHttpConnection}.
+ */
+public class AbfsConnFactory extends ManagedHttpClientConnectionFactory {

Review Comment:
   taken.
[jira] [Commented] (HADOOP-19120) [ABFS]: ApacheHttpClient adaptation as network library
[ https://issues.apache.org/jira/browse/HADOOP-19120?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17854375#comment-17854375 ]

ASF GitHub Bot commented on HADOOP-19120:

saxenapranav commented on code in PR #6633:
URL: https://github.com/apache/hadoop/pull/6633#discussion_r1636215857

## hadoop-tools/hadoop-azure/src/main/java/org/apache/hadoop/fs/azurebfs/services/AbfsConnectionManager.java:
##
@@ -0,0 +1,155 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements. See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership. The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.hadoop.fs.azurebfs.services;
+
+import java.io.IOException;
+import java.util.concurrent.ExecutionException;
+import java.util.concurrent.TimeUnit;
+
+import org.apache.http.HttpClientConnection;
+import org.apache.http.config.Registry;
+import org.apache.http.config.SocketConfig;
+import org.apache.http.conn.ConnectionPoolTimeoutException;
+import org.apache.http.conn.ConnectionRequest;
+import org.apache.http.conn.HttpClientConnectionManager;
+import org.apache.http.conn.HttpClientConnectionOperator;
+import org.apache.http.conn.routing.HttpRoute;
+import org.apache.http.conn.socket.ConnectionSocketFactory;
+import org.apache.http.impl.conn.DefaultHttpClientConnectionOperator;
+import org.apache.http.impl.conn.ManagedHttpClientConnectionFactory;
+import org.apache.http.protocol.HttpContext;
+
+/**
+ * AbfsConnectionManager is a custom implementation of {@link HttpClientConnectionManager}.
+ * This implementation manages connection-pooling heuristics and custom implementation
+ * of {@link ManagedHttpClientConnectionFactory}.
+ */
+class AbfsConnectionManager implements HttpClientConnectionManager {
+
+  private final KeepAliveCache kac = KeepAliveCache.getInstance();
+
+  private final AbfsConnFactory httpConnectionFactory;
+
+  private final HttpClientConnectionOperator connectionOperator;
+
+  AbfsConnectionManager(Registry socketFactoryRegistry,
+      AbfsConnFactory connectionFactory) {
+    this.httpConnectionFactory = connectionFactory;
+    connectionOperator = new DefaultHttpClientConnectionOperator(
+        socketFactoryRegistry, null, null);
+  }
+
+  @Override
+  public ConnectionRequest requestConnection(final HttpRoute route,
+      final Object state) {
+    return new ConnectionRequest() {

Review Comment:
   taken.

## hadoop-tools/hadoop-azure/src/main/java/org/apache/hadoop/fs/azurebfs/services/AbfsConnectionManager.java:
##
@@ -0,0 +1,155 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements. See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership. The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.hadoop.fs.azurebfs.services;
+
+import java.io.IOException;
+import java.util.concurrent.ExecutionException;
+import java.util.concurrent.TimeUnit;
+
+import org.apache.http.HttpClientConnection;
+import org.apache.http.config.Registry;
+import org.apache.http.config.SocketConfig;
+import org.apache.http.conn.ConnectionPoolTimeoutException;
+import org.apache.http.conn.ConnectionRequest;
+import org.apache.http.conn.HttpClientConnectionManager;
+import org.apache.http.conn.HttpClientConnectionOperator;
+import org.apache.http.conn.routing.HttpRoute;
+import org.apache.http.conn.socket.ConnectionSocketFactory;
+import org.apache.http.impl.conn.DefaultHttpClientConnectionOperator;
+import org.apache.http.impl.conn.ManagedHttpClientConnectionFactory;
+import org.apache.http.protocol.HttpContext;
+
+/**
+ * AbfsConnectionManager is a custom implementatio
[jira] [Commented] (HADOOP-19120) [ABFS]: ApacheHttpClient adaptation as network library
[ https://issues.apache.org/jira/browse/HADOOP-19120?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17854374#comment-17854374 ] ASF GitHub Bot commented on HADOOP-19120: - saxenapranav commented on code in PR #6633: URL: https://github.com/apache/hadoop/pull/6633#discussion_r1636215574 ## hadoop-tools/hadoop-azure/src/main/java/org/apache/hadoop/fs/azurebfs/services/AbfsConnectionManager.java: ## @@ -0,0 +1,155 @@ +/** + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */ + +package org.apache.hadoop.fs.azurebfs.services; + +import java.io.IOException; +import java.util.concurrent.ExecutionException; +import java.util.concurrent.TimeUnit; + +import org.apache.http.HttpClientConnection; +import org.apache.http.config.Registry; +import org.apache.http.config.SocketConfig; +import org.apache.http.conn.ConnectionPoolTimeoutException; +import org.apache.http.conn.ConnectionRequest; +import org.apache.http.conn.HttpClientConnectionManager; +import org.apache.http.conn.HttpClientConnectionOperator; +import org.apache.http.conn.routing.HttpRoute; +import org.apache.http.conn.socket.ConnectionSocketFactory; +import org.apache.http.impl.conn.DefaultHttpClientConnectionOperator; +import org.apache.http.impl.conn.ManagedHttpClientConnectionFactory; +import org.apache.http.protocol.HttpContext; + +/** + * AbfsConnectionManager is a custom implementation of {@link HttpClientConnectionManager}. + * This implementation manages connection-pooling heuristics and custom implementation + * of {@link ManagedHttpClientConnectionFactory}. 
+ */ +class AbfsConnectionManager implements HttpClientConnectionManager { + + private final KeepAliveCache kac = KeepAliveCache.getInstance(); + + private final AbfsConnFactory httpConnectionFactory; + + private final HttpClientConnectionOperator connectionOperator; + + AbfsConnectionManager(Registry<ConnectionSocketFactory> socketFactoryRegistry, + AbfsConnFactory connectionFactory) { +this.httpConnectionFactory = connectionFactory; +connectionOperator = new DefaultHttpClientConnectionOperator( +socketFactoryRegistry, null, null); + } + + @Override + public ConnectionRequest requestConnection(final HttpRoute route, + final Object state) { +return new ConnectionRequest() { + @Override + public HttpClientConnection get(final long timeout, + final TimeUnit timeUnit) + throws InterruptedException, ExecutionException, + ConnectionPoolTimeoutException { +try { + HttpClientConnection clientConn = kac.get(route); + if (clientConn != null) { +return clientConn; + } + return httpConnectionFactory.create(route, null); +} catch (IOException ex) { + throw new ExecutionException(ex); +} + } + + @Override + public boolean cancel() { +return false; + } +}; + } + + /** + * Releases a connection for reuse. It can be reused only if validDuration is greater than 0. + * This method is called by {@link org.apache.http.impl.execchain} internal class `ConnectionHolder`. + * If it wants to reuse the connection, it will send a non-zero validDuration, else it will send 0.
+ * @param conn the connection to release + * @param newState the new state of the connection + * @param validDuration the duration for which the connection is valid + * @param timeUnit the time unit for the validDuration + */ + @Override + public void releaseConnection(final HttpClientConnection conn, + final Object newState, + final long validDuration, + final TimeUnit timeUnit) { +if (validDuration == 0) { + return; +} +if (conn.isOpen() && conn instanceof AbfsManagedApacheHttpConnection) { + HttpRoute route = ((AbfsManagedApacheHttpConnection) conn).getHttpRoute(); Review Comment: taken.
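Editorial sketch (not part of the PR): the release/reuse contract in the quoted javadoc, combined with the JDK-style cap from the issue description (`http.maxConnections`, default 5), could look roughly like the class below. The class name `BoundedKeepAliveCache`, its generic route/connection types, the per-route scope of the cap, and the LIFO reuse order are all assumptions made for illustration. The `KeepAliveCache` in the PR may differ on every one of these points.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.TimeUnit;

/**
 * Hypothetical bounded keep-alive cache. R is the route type and C the
 * connection type; neither is tied to Apache HttpClient classes here.
 */
final class BoundedKeepAliveCache<R, C> {

  /** An idle connection plus the System.nanoTime() value after which it is stale. */
  private static final class Entry<C> {
    private final C conn;
    private final long expiresAtNanos;

    Entry(C conn, long expiresAtNanos) {
      this.conn = conn;
      this.expiresAtNanos = expiresAtNanos;
    }
  }

  private final int maxPerRoute;
  private final Map<R, Deque<Entry<C>>> idle = new HashMap<>();

  BoundedKeepAliveCache(int maxPerRoute) {
    this.maxPerRoute = maxPerRoute;
  }

  /** Reads the cap from the http.maxConnections system property, falling back to 5. */
  static int defaultMaxConnections() {
    return Integer.getInteger("http.maxConnections", 5);
  }

  /** Returns the most recently released, still valid connection, or null on a miss. */
  synchronized C get(R route) {
    Deque<Entry<C>> stack = idle.get(route);
    if (stack == null) {
      return null;
    }
    long now = System.nanoTime();
    for (Entry<C> e = stack.pollFirst(); e != null; e = stack.pollFirst()) {
      if (e.expiresAtNanos - now > 0) {
        return e.conn;
      }
      // Expired entry: drop it and keep looking at older ones.
    }
    return null;
  }

  /**
   * Offers a connection for reuse. Returns false when the connection was not
   * cached (validDuration is not positive, or the route is already at its cap);
   * in that case the caller remains responsible for closing it.
   */
  synchronized boolean release(R route, C conn, long validDuration, TimeUnit unit) {
    if (validDuration <= 0) {
      return false;
    }
    Deque<Entry<C>> stack = idle.computeIfAbsent(route, r -> new ArrayDeque<>());
    if (stack.size() >= maxPerRoute) {
      return false;
    }
    stack.addFirst(new Entry<>(conn, System.nanoTime() + unit.toNanos(validDuration)));
    return true;
  }
}
```

In this sketch a caller would try `get(route)` first and open a new connection only on a `null` result, which mirrors the `requestConnection` flow quoted above. Nothing limits how many connections are open at once; only the number of idle connections kept per route is capped.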
[jira] [Commented] (HADOOP-19120) [ABFS]: ApacheHttpClient adaptation as network library
[ https://issues.apache.org/jira/browse/HADOOP-19120?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17854373#comment-17854373 ] ASF GitHub Bot commented on HADOOP-19120: - saxenapranav commented on code in PR #6633: URL: https://github.com/apache/hadoop/pull/6633#discussion_r1636213297 ## hadoop-tools/hadoop-azure/src/main/java/org/apache/hadoop/fs/azurebfs/services/AbfsConnectionManager.java: ## @@ -0,0 +1,155 @@ +/** + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */ + +package org.apache.hadoop.fs.azurebfs.services; + +import java.io.IOException; +import java.util.concurrent.ExecutionException; +import java.util.concurrent.TimeUnit; + +import org.apache.http.HttpClientConnection; +import org.apache.http.config.Registry; +import org.apache.http.config.SocketConfig; +import org.apache.http.conn.ConnectionPoolTimeoutException; +import org.apache.http.conn.ConnectionRequest; +import org.apache.http.conn.HttpClientConnectionManager; +import org.apache.http.conn.HttpClientConnectionOperator; +import org.apache.http.conn.routing.HttpRoute; +import org.apache.http.conn.socket.ConnectionSocketFactory; +import org.apache.http.impl.conn.DefaultHttpClientConnectionOperator; +import org.apache.http.impl.conn.ManagedHttpClientConnectionFactory; +import org.apache.http.protocol.HttpContext; + +/** + * AbfsConnectionManager is a custom implementation of {@link HttpClientConnectionManager}. Review Comment: taken.
[jira] [Commented] (HADOOP-19120) [ABFS]: ApacheHttpClient adaptation as network library
[ https://issues.apache.org/jira/browse/HADOOP-19120?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17854372#comment-17854372 ] ASF GitHub Bot commented on HADOOP-19120: - saxenapranav commented on code in PR #6633: URL: https://github.com/apache/hadoop/pull/6633#discussion_r1636212193 ## hadoop-tools/hadoop-azure/src/main/java/org/apache/hadoop/fs/azurebfs/services/AbfsConnectionManager.java: ## @@ -0,0 +1,155 @@ +/** + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */ + +package org.apache.hadoop.fs.azurebfs.services; + +import java.io.IOException; +import java.util.concurrent.ExecutionException; +import java.util.concurrent.TimeUnit; + +import org.apache.http.HttpClientConnection; +import org.apache.http.config.Registry; +import org.apache.http.config.SocketConfig; +import org.apache.http.conn.ConnectionPoolTimeoutException; +import org.apache.http.conn.ConnectionRequest; +import org.apache.http.conn.HttpClientConnectionManager; +import org.apache.http.conn.HttpClientConnectionOperator; +import org.apache.http.conn.routing.HttpRoute; +import org.apache.http.conn.socket.ConnectionSocketFactory; +import org.apache.http.impl.conn.DefaultHttpClientConnectionOperator; +import org.apache.http.impl.conn.ManagedHttpClientConnectionFactory; +import org.apache.http.protocol.HttpContext; + +/** + * AbfsConnectionManager is a custom implementation of {@link HttpClientConnectionManager}. + * This implementation manages connection-pooling heuristics and custom implementation + * of {@link ManagedHttpClientConnectionFactory}. + */ +class AbfsConnectionManager implements HttpClientConnectionManager { + + private final KeepAliveCache kac = KeepAliveCache.getInstance(); + + private final AbfsConnFactory httpConnectionFactory; + + private final HttpClientConnectionOperator connectionOperator; + + AbfsConnectionManager(Registry<ConnectionSocketFactory> socketFactoryRegistry, + AbfsConnFactory connectionFactory) { +this.httpConnectionFactory = connectionFactory; +connectionOperator = new DefaultHttpClientConnectionOperator( Review Comment: taken.
[jira] [Commented] (HADOOP-19120) [ABFS]: ApacheHttpClient adaptation as network library
[ https://issues.apache.org/jira/browse/HADOOP-19120?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17854371#comment-17854371 ] ASF GitHub Bot commented on HADOOP-19120: - saxenapranav commented on code in PR #6633: URL: https://github.com/apache/hadoop/pull/6633#discussion_r1636211656 ## hadoop-tools/hadoop-azure/src/main/java/org/apache/hadoop/fs/azurebfs/services/AbfsApacheHttpClient.java: ## @@ -0,0 +1,123 @@ +/** + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */ + +package org.apache.hadoop.fs.azurebfs.services; + +import java.io.IOException; + +import org.apache.hadoop.security.ssl.DelegatingSSLSocketFactory; +import org.apache.http.HttpResponse; +import org.apache.http.client.config.RequestConfig; +import org.apache.http.client.methods.HttpRequestBase; +import org.apache.http.config.Registry; +import org.apache.http.config.RegistryBuilder; +import org.apache.http.conn.socket.ConnectionSocketFactory; +import org.apache.http.conn.socket.PlainConnectionSocketFactory; +import org.apache.http.conn.ssl.SSLConnectionSocketFactory; +import org.apache.http.impl.client.CloseableHttpClient; +import org.apache.http.impl.client.HttpClientBuilder; +import org.apache.http.impl.client.HttpClients; + +import static org.apache.hadoop.fs.azurebfs.constants.AbfsHttpConstants.EMPTY_STRING; +import static org.apache.hadoop.fs.azurebfs.constants.FileSystemUriSchemes.HTTPS_SCHEME; +import static org.apache.hadoop.fs.azurebfs.constants.FileSystemUriSchemes.HTTP_SCHEME; +import static org.apache.http.conn.ssl.SSLConnectionSocketFactory.getDefaultHostnameVerifier; + +final class AbfsApacheHttpClient { Review Comment: Added the javadocs. ## hadoop-tools/hadoop-azure/src/main/java/org/apache/hadoop/fs/azurebfs/services/AbfsClient.java: ## @@ -1217,7 +1226,7 @@ public AbfsRestOperation deletePath(final String path, final boolean recursive, this, HTTP_METHOD_DELETE, url, -requestHeaders); +requestHeaders, abfsConfiguration); Review Comment: makes sense. taken.
[jira] [Commented] (HADOOP-19120) [ABFS]: ApacheHttpClient adaptation as network library
[ https://issues.apache.org/jira/browse/HADOOP-19120?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17854369#comment-17854369 ] ASF GitHub Bot commented on HADOOP-19120: - saxenapranav commented on code in PR #6633: URL: https://github.com/apache/hadoop/pull/6633#discussion_r1636209861 ## hadoop-tools/hadoop-azure/src/main/java/org/apache/hadoop/fs/azurebfs/services/AbfsAHCHttpOperation.java: ## @@ -0,0 +1,374 @@ +/** + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */ + +package org.apache.hadoop.fs.azurebfs.services; + +import java.io.IOException; +import java.io.InputStream; +import java.net.URI; +import java.net.URISyntaxException; +import java.net.URL; +import java.util.ArrayList; +import java.util.Collections; +import java.util.HashMap; +import java.util.List; +import java.util.Map; + +import org.slf4j.Logger; +import org.slf4j.LoggerFactory; + +import org.apache.hadoop.classification.VisibleForTesting; +import org.apache.hadoop.fs.PathIOException; +import org.apache.hadoop.fs.azurebfs.constants.AbfsHttpConstants; +import org.apache.hadoop.fs.azurebfs.constants.HttpHeaderConfigurations; +import org.apache.hadoop.fs.azurebfs.contracts.exceptions.AbfsApacheHttpExpect100Exception; +import org.apache.http.Header; +import org.apache.http.HttpEntity; +import org.apache.http.HttpResponse; +import org.apache.http.client.methods.CloseableHttpResponse; +import org.apache.http.client.methods.HttpDelete; +import org.apache.http.client.methods.HttpEntityEnclosingRequestBase; +import org.apache.http.client.methods.HttpGet; +import org.apache.http.client.methods.HttpHead; +import org.apache.http.client.methods.HttpPatch; +import org.apache.http.client.methods.HttpPost; +import org.apache.http.client.methods.HttpPut; +import org.apache.http.client.methods.HttpRequestBase; +import org.apache.http.entity.ByteArrayEntity; +import org.apache.http.util.EntityUtils; + +import static org.apache.hadoop.fs.azurebfs.constants.AbfsHttpConstants.APACHE_IMPL; +import static org.apache.hadoop.fs.azurebfs.constants.AbfsHttpConstants.EMPTY_STRING; +import static org.apache.hadoop.fs.azurebfs.constants.AbfsHttpConstants.HTTP_METHOD_DELETE; +import static org.apache.hadoop.fs.azurebfs.constants.AbfsHttpConstants.HTTP_METHOD_GET; +import static org.apache.hadoop.fs.azurebfs.constants.AbfsHttpConstants.HTTP_METHOD_HEAD; +import static org.apache.hadoop.fs.azurebfs.constants.AbfsHttpConstants.HTTP_METHOD_PATCH; +import static 
org.apache.hadoop.fs.azurebfs.constants.AbfsHttpConstants.HTTP_METHOD_POST; +import static org.apache.hadoop.fs.azurebfs.constants.AbfsHttpConstants.HTTP_METHOD_PUT; +import static org.apache.hadoop.fs.azurebfs.constants.HttpHeaderConfigurations.X_MS_CLIENT_REQUEST_ID; +import static org.apache.http.entity.ContentType.TEXT_PLAIN; + +/** + * Implementation of {@link AbfsHttpOperation} for orchestrating server calls using + * Apache Http Client. + */ +public class AbfsAHCHttpOperation extends AbfsHttpOperation { + + private static final Logger LOG = LoggerFactory.getLogger( + AbfsAHCHttpOperation.class); + + private HttpRequestBase httpRequestBase; + + private HttpResponse httpResponse; + + private boolean connectionDisconnectedOnError = false; + + private final boolean isPayloadRequest; + + public AbfsAHCHttpOperation(final URL url, + final String method, + final List requestHeaders, + final int connectionTimeout, + final int readTimeout) { +super(LOG, url, method, requestHeaders, connectionTimeout, readTimeout); +this.isPayloadRequest = isPayloadRequest(method); + } + + @VisibleForTesting + AbfsManagedHttpClientContext setFinalAbfsClientContext() { +return new AbfsManagedHttpClientContext(); + } + + private boolean isPayloadRequest(final String method) { +return HTTP_METHOD_PUT.equals(method) || HTTP_METHOD_PATCH.equals(method) +|| HTTP_METHOD_POST.equals(method); + } + + @Override + protected InputStream getErrorStream() throws IOException { +HttpEntity entity = httpResponse.getEntity(); +if (entity == null) { + return null; +} +return entity.getContent(); + }
[jira] [Commented] (HADOOP-19120) [ABFS]: ApacheHttpClient adaptation as network library
[ https://issues.apache.org/jira/browse/HADOOP-19120?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17854353#comment-17854353 ] ASF GitHub Bot commented on HADOOP-19120: - saxenapranav commented on code in PR #6633: URL: https://github.com/apache/hadoop/pull/6633#discussion_r1636160749 ## hadoop-tools/hadoop-azure/src/main/java/org/apache/hadoop/fs/azurebfs/services/AbfsAHCHttpOperation.java: ## @@ -0,0 +1,374 @@ +/** + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */ + +package org.apache.hadoop.fs.azurebfs.services; + +import java.io.IOException; +import java.io.InputStream; +import java.net.URI; +import java.net.URISyntaxException; +import java.net.URL; +import java.util.ArrayList; +import java.util.Collections; +import java.util.HashMap; +import java.util.List; +import java.util.Map; + +import org.slf4j.Logger; +import org.slf4j.LoggerFactory; + +import org.apache.hadoop.classification.VisibleForTesting; +import org.apache.hadoop.fs.PathIOException; +import org.apache.hadoop.fs.azurebfs.constants.AbfsHttpConstants; +import org.apache.hadoop.fs.azurebfs.constants.HttpHeaderConfigurations; +import org.apache.hadoop.fs.azurebfs.contracts.exceptions.AbfsApacheHttpExpect100Exception; +import org.apache.http.Header; +import org.apache.http.HttpEntity; +import org.apache.http.HttpResponse; +import org.apache.http.client.methods.CloseableHttpResponse; +import org.apache.http.client.methods.HttpDelete; +import org.apache.http.client.methods.HttpEntityEnclosingRequestBase; +import org.apache.http.client.methods.HttpGet; +import org.apache.http.client.methods.HttpHead; +import org.apache.http.client.methods.HttpPatch; +import org.apache.http.client.methods.HttpPost; +import org.apache.http.client.methods.HttpPut; +import org.apache.http.client.methods.HttpRequestBase; +import org.apache.http.entity.ByteArrayEntity; +import org.apache.http.util.EntityUtils; + +import static org.apache.hadoop.fs.azurebfs.constants.AbfsHttpConstants.APACHE_IMPL; +import static org.apache.hadoop.fs.azurebfs.constants.AbfsHttpConstants.EMPTY_STRING; +import static org.apache.hadoop.fs.azurebfs.constants.AbfsHttpConstants.HTTP_METHOD_DELETE; +import static org.apache.hadoop.fs.azurebfs.constants.AbfsHttpConstants.HTTP_METHOD_GET; +import static org.apache.hadoop.fs.azurebfs.constants.AbfsHttpConstants.HTTP_METHOD_HEAD; +import static org.apache.hadoop.fs.azurebfs.constants.AbfsHttpConstants.HTTP_METHOD_PATCH; +import static 
org.apache.hadoop.fs.azurebfs.constants.AbfsHttpConstants.HTTP_METHOD_POST; +import static org.apache.hadoop.fs.azurebfs.constants.AbfsHttpConstants.HTTP_METHOD_PUT; +import static org.apache.hadoop.fs.azurebfs.constants.HttpHeaderConfigurations.X_MS_CLIENT_REQUEST_ID; +import static org.apache.http.entity.ContentType.TEXT_PLAIN; + +/** + * Implementation of {@link AbfsHttpOperation} for orchestrating server calls using + * Apache Http Client. + */ +public class AbfsAHCHttpOperation extends AbfsHttpOperation { + + private static final Logger LOG = LoggerFactory.getLogger( + AbfsAHCHttpOperation.class); + + private HttpRequestBase httpRequestBase; + + private HttpResponse httpResponse; + + private boolean connectionDisconnectedOnError = false; + + private final boolean isPayloadRequest; + + public AbfsAHCHttpOperation(final URL url, + final String method, + final List requestHeaders, + final int connectionTimeout, + final int readTimeout) { +super(LOG, url, method, requestHeaders, connectionTimeout, readTimeout); +this.isPayloadRequest = isPayloadRequest(method); Review Comment: taken.
[jira] [Commented] (HADOOP-19120) [ABFS]: ApacheHttpClient adaptation as network library
[ https://issues.apache.org/jira/browse/HADOOP-19120?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17854351#comment-17854351 ] ASF GitHub Bot commented on HADOOP-19120: - saxenapranav commented on code in PR #6633: URL: https://github.com/apache/hadoop/pull/6633#discussion_r1636158908 ## hadoop-tools/hadoop-azure/src/main/java/org/apache/hadoop/fs/azurebfs/services/AbfsAHCHttpOperation.java: ## @@ -0,0 +1,374 @@ +/** + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */ + +package org.apache.hadoop.fs.azurebfs.services; + +import java.io.IOException; +import java.io.InputStream; +import java.net.URI; +import java.net.URISyntaxException; +import java.net.URL; +import java.util.ArrayList; +import java.util.Collections; +import java.util.HashMap; +import java.util.List; +import java.util.Map; + +import org.slf4j.Logger; +import org.slf4j.LoggerFactory; + +import org.apache.hadoop.classification.VisibleForTesting; +import org.apache.hadoop.fs.PathIOException; +import org.apache.hadoop.fs.azurebfs.constants.AbfsHttpConstants; +import org.apache.hadoop.fs.azurebfs.constants.HttpHeaderConfigurations; +import org.apache.hadoop.fs.azurebfs.contracts.exceptions.AbfsApacheHttpExpect100Exception; +import org.apache.http.Header; +import org.apache.http.HttpEntity; +import org.apache.http.HttpResponse; +import org.apache.http.client.methods.CloseableHttpResponse; +import org.apache.http.client.methods.HttpDelete; +import org.apache.http.client.methods.HttpEntityEnclosingRequestBase; +import org.apache.http.client.methods.HttpGet; +import org.apache.http.client.methods.HttpHead; +import org.apache.http.client.methods.HttpPatch; +import org.apache.http.client.methods.HttpPost; +import org.apache.http.client.methods.HttpPut; +import org.apache.http.client.methods.HttpRequestBase; +import org.apache.http.entity.ByteArrayEntity; +import org.apache.http.util.EntityUtils; + +import static org.apache.hadoop.fs.azurebfs.constants.AbfsHttpConstants.APACHE_IMPL; +import static org.apache.hadoop.fs.azurebfs.constants.AbfsHttpConstants.EMPTY_STRING; +import static org.apache.hadoop.fs.azurebfs.constants.AbfsHttpConstants.HTTP_METHOD_DELETE; +import static org.apache.hadoop.fs.azurebfs.constants.AbfsHttpConstants.HTTP_METHOD_GET; +import static org.apache.hadoop.fs.azurebfs.constants.AbfsHttpConstants.HTTP_METHOD_HEAD; +import static org.apache.hadoop.fs.azurebfs.constants.AbfsHttpConstants.HTTP_METHOD_PATCH; +import static 
org.apache.hadoop.fs.azurebfs.constants.AbfsHttpConstants.HTTP_METHOD_POST; +import static org.apache.hadoop.fs.azurebfs.constants.AbfsHttpConstants.HTTP_METHOD_PUT; +import static org.apache.hadoop.fs.azurebfs.constants.HttpHeaderConfigurations.X_MS_CLIENT_REQUEST_ID; +import static org.apache.http.entity.ContentType.TEXT_PLAIN; + +/** + * Implementation of {@link AbfsHttpOperation} for orchestrating server calls using + * Apache Http Client. + */ +public class AbfsAHCHttpOperation extends AbfsHttpOperation { + + private static final Logger LOG = LoggerFactory.getLogger( + AbfsAHCHttpOperation.class); + + private HttpRequestBase httpRequestBase; + + private HttpResponse httpResponse; + + private boolean connectionDisconnectedOnError = false; + + private final boolean isPayloadRequest; + + public AbfsAHCHttpOperation(final URL url, + final String method, + final List requestHeaders, + final int connectionTimeout, + final int readTimeout) { +super(LOG, url, method, requestHeaders, connectionTimeout, readTimeout); +this.isPayloadRequest = isPayloadRequest(method); + } + + @VisibleForTesting + AbfsManagedHttpClientContext setFinalAbfsClientContext() { +return new AbfsManagedHttpClientContext(); + } + + private boolean isPayloadRequest(final String method) { +return HTTP_METHOD_PUT.equals(method) || HTTP_METHOD_PATCH.equals(method) +|| HTTP_METHOD_POST.equals(method); + } + + @Override + protected InputStream getErrorStream() throws IOException { +HttpEntity entity = httpResponse.getEntity(); +if (entity == null) { + return null; +} +return entity.getContent(); + }
[jira] [Commented] (HADOOP-19120) [ABFS]: ApacheHttpClient adaptation as network library
[ https://issues.apache.org/jira/browse/HADOOP-19120?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17854349#comment-17854349 ]

ASF GitHub Bot commented on HADOOP-19120:
-----------------------------------------

saxenapranav commented on code in PR #6633:
URL: https://github.com/apache/hadoop/pull/6633#discussion_r1636158484

##
hadoop-tools/hadoop-azure/src/main/java/org/apache/hadoop/fs/azurebfs/services/AbfsAHCHttpOperation.java:
##
@@ -0,0 +1,374 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements. See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership. The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.hadoop.fs.azurebfs.services;
+
+import java.io.IOException;
+import java.io.InputStream;
+import java.net.URI;
+import java.net.URISyntaxException;
+import java.net.URL;
+import java.util.ArrayList;
+import java.util.Collections;
+import java.util.HashMap;
+import java.util.List;
+import java.util.Map;
+
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import org.apache.hadoop.classification.VisibleForTesting;
+import org.apache.hadoop.fs.PathIOException;
+import org.apache.hadoop.fs.azurebfs.constants.AbfsHttpConstants;
+import org.apache.hadoop.fs.azurebfs.constants.HttpHeaderConfigurations;
+import org.apache.hadoop.fs.azurebfs.contracts.exceptions.AbfsApacheHttpExpect100Exception;
+import org.apache.http.Header;
+import org.apache.http.HttpEntity;
+import org.apache.http.HttpResponse;
+import org.apache.http.client.methods.CloseableHttpResponse;
+import org.apache.http.client.methods.HttpDelete;
+import org.apache.http.client.methods.HttpEntityEnclosingRequestBase;
+import org.apache.http.client.methods.HttpGet;
+import org.apache.http.client.methods.HttpHead;
+import org.apache.http.client.methods.HttpPatch;
+import org.apache.http.client.methods.HttpPost;
+import org.apache.http.client.methods.HttpPut;
+import org.apache.http.client.methods.HttpRequestBase;
+import org.apache.http.entity.ByteArrayEntity;
+import org.apache.http.util.EntityUtils;
+
+import static org.apache.hadoop.fs.azurebfs.constants.AbfsHttpConstants.APACHE_IMPL;
+import static org.apache.hadoop.fs.azurebfs.constants.AbfsHttpConstants.EMPTY_STRING;
+import static org.apache.hadoop.fs.azurebfs.constants.AbfsHttpConstants.HTTP_METHOD_DELETE;
+import static org.apache.hadoop.fs.azurebfs.constants.AbfsHttpConstants.HTTP_METHOD_GET;
+import static org.apache.hadoop.fs.azurebfs.constants.AbfsHttpConstants.HTTP_METHOD_HEAD;
+import static org.apache.hadoop.fs.azurebfs.constants.AbfsHttpConstants.HTTP_METHOD_PATCH;
+import static org.apache.hadoop.fs.azurebfs.constants.AbfsHttpConstants.HTTP_METHOD_POST;
+import static org.apache.hadoop.fs.azurebfs.constants.AbfsHttpConstants.HTTP_METHOD_PUT;
+import static org.apache.hadoop.fs.azurebfs.constants.HttpHeaderConfigurations.X_MS_CLIENT_REQUEST_ID;
+import static org.apache.http.entity.ContentType.TEXT_PLAIN;
+
+/**
+ * Implementation of {@link AbfsHttpOperation} for orchestrating server calls using
+ * Apache Http Client.
+ */
+public class AbfsAHCHttpOperation extends AbfsHttpOperation {
+
+  private static final Logger LOG = LoggerFactory.getLogger(
+      AbfsAHCHttpOperation.class);
+
+  private HttpRequestBase httpRequestBase;
+
+  private HttpResponse httpResponse;
+
+  private boolean connectionDisconnectedOnError = false;
+
+  private final boolean isPayloadRequest;
+
+  public AbfsAHCHttpOperation(final URL url,
+      final String method,
+      final List requestHeaders,
+      final int connectionTimeout,
+      final int readTimeout) {
+    super(LOG, url, method, requestHeaders, connectionTimeout, readTimeout);
+    this.isPayloadRequest = isPayloadRequest(method);
+  }
+
+  @VisibleForTesting
+  AbfsManagedHttpClientContext setFinalAbfsClientContext() {
+    return new AbfsManagedHttpClientContext();
+  }
+
+  private boolean isPayloadRequest(final String method) {
+    return HTTP_METHOD_PUT.equals(method) || HTTP_METHOD_PATCH.equals(method)
+        || HTTP_METHOD_POST.equals(method);
+  }
+
+  @Override
+  protected InputStream getErrorStream() throws IOException {
+    HttpEntity entity = httpResponse.getEntity();
+    if (entity == null) {
+      return null;
+    }
+    return entity.getContent();
+  }
[jira] [Commented] (HADOOP-19120) [ABFS]: ApacheHttpClient adaptation as network library
[ https://issues.apache.org/jira/browse/HADOOP-19120?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17854347#comment-17854347 ]

ASF GitHub Bot commented on HADOOP-19120:
-----------------------------------------

saxenapranav commented on code in PR #6633:
URL: https://github.com/apache/hadoop/pull/6633#discussion_r1636156126

##
hadoop-tools/hadoop-azure/src/main/java/org/apache/hadoop/fs/azurebfs/services/AbfsAHCHttpOperation.java:
##
@@ -0,0 +1,374 @@
[jira] [Commented] (HADOOP-19120) [ABFS]: ApacheHttpClient adaptation as network library
[ https://issues.apache.org/jira/browse/HADOOP-19120?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17854348#comment-17854348 ]

ASF GitHub Bot commented on HADOOP-19120:
-----------------------------------------

saxenapranav commented on code in PR #6633:
URL: https://github.com/apache/hadoop/pull/6633#discussion_r1636156890

##
hadoop-tools/hadoop-azure/src/main/java/org/apache/hadoop/fs/azurebfs/services/AbfsAHCHttpOperation.java:
##
@@ -0,0 +1,374 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements. See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership. The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.hadoop.fs.azurebfs.services;
+
+import java.io.IOException;
+import java.io.InputStream;
+import java.net.URI;
+import java.net.URISyntaxException;
+import java.net.URL;
+import java.util.ArrayList;
+import java.util.Collections;
+import java.util.HashMap;
+import java.util.List;
+import java.util.Map;
+
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import org.apache.hadoop.classification.VisibleForTesting;
+import org.apache.hadoop.fs.PathIOException;
+import org.apache.hadoop.fs.azurebfs.constants.AbfsHttpConstants;
+import org.apache.hadoop.fs.azurebfs.constants.HttpHeaderConfigurations;
+import org.apache.hadoop.fs.azurebfs.contracts.exceptions.AbfsApacheHttpExpect100Exception;
+import org.apache.http.Header;
+import org.apache.http.HttpEntity;
+import org.apache.http.HttpResponse;
+import org.apache.http.client.methods.CloseableHttpResponse;
+import org.apache.http.client.methods.HttpDelete;
+import org.apache.http.client.methods.HttpEntityEnclosingRequestBase;
+import org.apache.http.client.methods.HttpGet;
+import org.apache.http.client.methods.HttpHead;
+import org.apache.http.client.methods.HttpPatch;
+import org.apache.http.client.methods.HttpPost;
+import org.apache.http.client.methods.HttpPut;
+import org.apache.http.client.methods.HttpRequestBase;
+import org.apache.http.entity.ByteArrayEntity;
+import org.apache.http.util.EntityUtils;
+
+import static org.apache.hadoop.fs.azurebfs.constants.AbfsHttpConstants.APACHE_IMPL;
+import static org.apache.hadoop.fs.azurebfs.constants.AbfsHttpConstants.EMPTY_STRING;
+import static org.apache.hadoop.fs.azurebfs.constants.AbfsHttpConstants.HTTP_METHOD_DELETE;
+import static org.apache.hadoop.fs.azurebfs.constants.AbfsHttpConstants.HTTP_METHOD_GET;
+import static org.apache.hadoop.fs.azurebfs.constants.AbfsHttpConstants.HTTP_METHOD_HEAD;
+import static org.apache.hadoop.fs.azurebfs.constants.AbfsHttpConstants.HTTP_METHOD_PATCH;
+import static org.apache.hadoop.fs.azurebfs.constants.AbfsHttpConstants.HTTP_METHOD_POST;
+import static org.apache.hadoop.fs.azurebfs.constants.AbfsHttpConstants.HTTP_METHOD_PUT;
+import static org.apache.hadoop.fs.azurebfs.constants.HttpHeaderConfigurations.X_MS_CLIENT_REQUEST_ID;
+import static org.apache.http.entity.ContentType.TEXT_PLAIN;
+
+/**
+ * Implementation of {@link AbfsHttpOperation} for orchestrating server calls using

Review Comment:
Makes sense; I have added javadocs for the new fields and the non-inherited methods.
[jira] [Commented] (HADOOP-19120) [ABFS]: ApacheHttpClient adaptation as network library
[ https://issues.apache.org/jira/browse/HADOOP-19120?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17854346#comment-17854346 ]

ASF GitHub Bot commented on HADOOP-19120:
-----------------------------------------

saxenapranav commented on code in PR #6633:
URL: https://github.com/apache/hadoop/pull/6633#discussion_r1636155617

##
hadoop-tools/hadoop-azure/src/main/java/org/apache/hadoop/fs/azurebfs/services/AbfsAHCHttpOperation.java:
##
@@ -0,0 +1,374 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements. See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership. The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.hadoop.fs.azurebfs.services;
+
+import java.io.IOException;
+import java.io.InputStream;
+import java.net.URI;
+import java.net.URISyntaxException;
+import java.net.URL;
+import java.util.ArrayList;
+import java.util.Collections;
+import java.util.HashMap;
+import java.util.List;
+import java.util.Map;
+
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import org.apache.hadoop.classification.VisibleForTesting;
+import org.apache.hadoop.fs.PathIOException;
+import org.apache.hadoop.fs.azurebfs.constants.AbfsHttpConstants;
+import org.apache.hadoop.fs.azurebfs.constants.HttpHeaderConfigurations;
+import org.apache.hadoop.fs.azurebfs.contracts.exceptions.AbfsApacheHttpExpect100Exception;
+import org.apache.http.Header;
+import org.apache.http.HttpEntity;
+import org.apache.http.HttpResponse;
+import org.apache.http.client.methods.CloseableHttpResponse;
+import org.apache.http.client.methods.HttpDelete;
+import org.apache.http.client.methods.HttpEntityEnclosingRequestBase;
+import org.apache.http.client.methods.HttpGet;
+import org.apache.http.client.methods.HttpHead;
+import org.apache.http.client.methods.HttpPatch;
+import org.apache.http.client.methods.HttpPost;
+import org.apache.http.client.methods.HttpPut;
+import org.apache.http.client.methods.HttpRequestBase;
+import org.apache.http.entity.ByteArrayEntity;
+import org.apache.http.util.EntityUtils;
+
+import static org.apache.hadoop.fs.azurebfs.constants.AbfsHttpConstants.APACHE_IMPL;
+import static org.apache.hadoop.fs.azurebfs.constants.AbfsHttpConstants.EMPTY_STRING;
+import static org.apache.hadoop.fs.azurebfs.constants.AbfsHttpConstants.HTTP_METHOD_DELETE;
+import static org.apache.hadoop.fs.azurebfs.constants.AbfsHttpConstants.HTTP_METHOD_GET;
+import static org.apache.hadoop.fs.azurebfs.constants.AbfsHttpConstants.HTTP_METHOD_HEAD;
+import static org.apache.hadoop.fs.azurebfs.constants.AbfsHttpConstants.HTTP_METHOD_PATCH;
+import static org.apache.hadoop.fs.azurebfs.constants.AbfsHttpConstants.HTTP_METHOD_POST;
+import static org.apache.hadoop.fs.azurebfs.constants.AbfsHttpConstants.HTTP_METHOD_PUT;
+import static org.apache.hadoop.fs.azurebfs.constants.HttpHeaderConfigurations.X_MS_CLIENT_REQUEST_ID;
+import static org.apache.http.entity.ContentType.TEXT_PLAIN;
+
+/**
+ * Implementation of {@link AbfsHttpOperation} for orchestrating server calls using
+ * Apache Http Client.
+ */
+public class AbfsAHCHttpOperation extends AbfsHttpOperation {
+
+  private static final Logger LOG = LoggerFactory.getLogger(
+      AbfsAHCHttpOperation.class);
+
+  private HttpRequestBase httpRequestBase;

Review Comment:
Added javadocs for the fields and the new method.
[jira] [Commented] (HADOOP-19120) [ABFS]: ApacheHttpClient adaptation as network library
[ https://issues.apache.org/jira/browse/HADOOP-19120?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17854345#comment-17854345 ]

ASF GitHub Bot commented on HADOOP-19120:
-----------------------------------------

saxenapranav commented on code in PR #6633:
URL: https://github.com/apache/hadoop/pull/6633#discussion_r1636154364

##
hadoop-tools/hadoop-azure/src/main/java/org/apache/hadoop/fs/azurebfs/contracts/exceptions/AbfsApacheHttpExpect100Exception.java:
##
@@ -0,0 +1,36 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements. See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership. The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.hadoop.fs.azurebfs.contracts.exceptions;
+
+import java.io.IOException;
+
+import org.apache.http.HttpResponse;
+
+public class AbfsApacheHttpExpect100Exception extends IOException {

Review Comment:
Makes sense; I have added a new class, HttpResponseException, and made AbfsApacheHttpExpect100Exception a child of it.

##
hadoop-tools/hadoop-azure/src/main/java/org/apache/hadoop/fs/azurebfs/contracts/exceptions/AbfsApacheHttpExpect100Exception.java:
##
@@ -0,0 +1,36 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements. See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership. The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.hadoop.fs.azurebfs.contracts.exceptions;
+
+import java.io.IOException;
+
+import org.apache.http.HttpResponse;
+
+public class AbfsApacheHttpExpect100Exception extends IOException {
+  private final HttpResponse httpResponse;
+
+  public AbfsApacheHttpExpect100Exception(final String s, final HttpResponse httpResponse) {
+    super(s);
+    this.httpResponse = httpResponse;

Review Comment:
Taken.
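The reply about HttpResponseException describes a small exception hierarchy: a parent that carries the HTTP response, with AbfsApacheHttpExpect100Exception as a child. A minimal sketch of that shape follows. It is an illustration under assumptions, not the code from the PR: `StubHttpResponse` is a hypothetical stand-in for `org.apache.http.HttpResponse` so the example compiles without HttpClient on the classpath, and the null check and the `getHttpResponse()` accessor are assumed rather than taken from the thread.

```java
import java.io.IOException;
import java.util.Objects;

/** Hypothetical stand-in for org.apache.http.HttpResponse (assumption, for a self-contained sketch). */
final class StubHttpResponse {
  private final int statusCode;

  StubHttpResponse(final int statusCode) {
    this.statusCode = statusCode;
  }

  int getStatusCode() {
    return statusCode;
  }
}

/** Assumed shape of the parent exception: an IOException that keeps the response that caused it. */
class HttpResponseException extends IOException {
  private final StubHttpResponse httpResponse;

  HttpResponseException(final String message, final StubHttpResponse httpResponse) {
    super(message);
    // Assumption: a response-less instance is treated as a programming error.
    this.httpResponse = Objects.requireNonNull(httpResponse,
        "httpResponse should be non-null");
  }

  StubHttpResponse getHttpResponse() {
    return httpResponse;
  }
}

/** Child exception for a rejected Expect: 100-continue handshake. */
class AbfsApacheHttpExpect100Exception extends HttpResponseException {
  AbfsApacheHttpExpect100Exception(final String message, final StubHttpResponse httpResponse) {
    super(message, httpResponse);
  }
}
```

With this layout a caller can catch the parent type and still read the status of the response, whichever child was thrown.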
[jira] [Commented] (HADOOP-19120) [ABFS]: ApacheHttpClient adaptation as network library
[ https://issues.apache.org/jira/browse/HADOOP-19120?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17854344#comment-17854344 ]

ASF GitHub Bot commented on HADOOP-19120:
-----------------------------------------

saxenapranav commented on code in PR #6633:
URL: https://github.com/apache/hadoop/pull/6633#discussion_r1636152295

##
hadoop-tools/hadoop-azure/src/main/java/org/apache/hadoop/fs/azurebfs/constants/AbfsHttpConstants.java:
##
@@ -165,5 +165,12 @@ public static ApiVersion getCurrentVersion() {
    */
   public static final Integer HTTP_STATUS_CATEGORY_QUOTIENT = 100;
+  public static final String HTTP_MAX_CONN_SYS_PROP = "http.maxConnections";

Review Comment:
Great advice, added.

> [ABFS]: ApacheHttpClient adaptation as network library
> ------------------------------------------------------
>
>                 Key: HADOOP-19120
>                 URL: https://issues.apache.org/jira/browse/HADOOP-19120
>             Project: Hadoop Common
>          Issue Type: Sub-task
>          Components: fs/azure
>    Affects Versions: 3.5.0
>            Reporter: Pranav Saxena
>            Assignee: Pranav Saxena
>            Priority: Major
>              Labels: pull-request-available
>
> Apache HttpClient is more feature-rich and flexible and gives application
> more granular control over networking parameter.
> ABFS currently relies on the JDK-net library. This library is managed by
> OpenJDK and has no performance problem. However, it limits the application's
> control over networking, and there are very few APIs and hooks exposed that
> the application can use to get metrics, choose which and when a connection
> should be reused. ApacheHttpClient will give important hooks to fetch
> important metrics and control networking parameters.
> A custom implementation of connection-pool is used. The implementation is
> adapted from the JDK8 connection pooling. Reasons for doing it:
> 1. PoolingHttpClientConnectionManager heuristic caches all the reusable
> connections it has created. JDK's implementation only caches limited number
> of connections. The limit is given by JVM system property
> "http.maxConnections". If there is no system-property, it defaults to 5.
> Connection-establishment latency increased with all the connections were
> cached. Hence, adapting the pooling heuristic of JDK netlib,
> 2. In PoolingHttpClientConnectionManager, it expects the application to
> provide `setMaxPerRoute` and `setMaxTotal`, which the implementation uses as
> the total number of connections it can create. For application using ABFS, it
> is not feasible to provide a value in the initialisation of the
> connectionManager. JDK's implementation has no cap on the number of
> connections it can have opened on a moment. Hence, adapting the pooling
> heuristic of JDK netlib,

--
This message was sent by Atlassian Jira
(v8.20.10#820010)

---------------------------------------------------------------------
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org
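The issue text says the JDK caches only as many idle connections as the JVM system property "http.maxConnections" allows, falling back to 5 when the property is absent. That lookup can be sketched as below. The constant names mirror the ones quoted in the diff; the class `MaxConnectionsResolver`, its method, and the choice to fall back to the default for non-positive values are assumptions made for this illustration, not something the thread specifies.

```java
/** Hypothetical helper (name assumed) showing how the cache limit could be resolved. */
final class MaxConnectionsResolver {
  static final String HTTP_MAX_CONN_SYS_PROP = "http.maxConnections";
  static final int DEFAULT_MAX_CONN_SYS_PROP = 5;

  private MaxConnectionsResolver() {
  }

  static int resolveMaxCachedConnections() {
    // Integer.getInteger returns the supplied default when the property is
    // unset or cannot be parsed as an integer.
    int value = Integer.getInteger(HTTP_MAX_CONN_SYS_PROP, DEFAULT_MAX_CONN_SYS_PROP);
    // Assumption: a non-positive limit is not meaningful, so use the default.
    return value > 0 ? value : DEFAULT_MAX_CONN_SYS_PROP;
  }
}
```

Starting the JVM with `-Dhttp.maxConnections=12` would make this helper return 12; without the flag it returns 5.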
[jira] [Commented] (HADOOP-19120) [ABFS]: ApacheHttpClient adaptation as network library
[ https://issues.apache.org/jira/browse/HADOOP-19120?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17854342#comment-17854342 ]

ASF GitHub Bot commented on HADOOP-19120:
-----------------------------------------

saxenapranav commented on code in PR #6633:
URL: https://github.com/apache/hadoop/pull/6633#discussion_r1636151400

##
hadoop-tools/hadoop-azure/src/main/java/org/apache/hadoop/fs/azurebfs/constants/AbfsHttpConstants.java:
##
@@ -165,5 +165,12 @@ public static ApiVersion getCurrentVersion() {
    */
   public static final Integer HTTP_STATUS_CATEGORY_QUOTIENT = 100;
+  public static final String HTTP_MAX_CONN_SYS_PROP = "http.maxConnections";
+  public static final Integer DEFAULT_MAX_CONN_SYS_PROP = 5;
+  public static final int KAC_DEFAULT_CONN_TTL = 5_000;

Review Comment:
This field has been removed.
[jira] [Commented] (HADOOP-19120) [ABFS]: ApacheHttpClient adaptation as network library
[ https://issues.apache.org/jira/browse/HADOOP-19120?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17854343#comment-17854343 ]

ASF GitHub Bot commented on HADOOP-19120:
-----------------------------------------

saxenapranav commented on code in PR #6633:
URL: https://github.com/apache/hadoop/pull/6633#discussion_r1636151835

##
hadoop-tools/hadoop-azure/src/main/java/org/apache/hadoop/fs/azurebfs/constants/AbfsHttpConstants.java:
##
@@ -165,5 +165,12 @@ public static ApiVersion getCurrentVersion() {
    */
   public static final Integer HTTP_STATUS_CATEGORY_QUOTIENT = 100;
+  public static final String HTTP_MAX_CONN_SYS_PROP = "http.maxConnections";
+  public static final Integer DEFAULT_MAX_CONN_SYS_PROP = 5;

Review Comment:
A larger cache size showed increased connection-establishment time. Hence, I am not changing the default and am keeping it equal to the JDK's.
[jira] [Commented] (HADOOP-19120) [ABFS]: ApacheHttpClient adaptation as network library
[ https://issues.apache.org/jira/browse/HADOOP-19120?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17854341#comment-17854341 ]

ASF GitHub Bot commented on HADOOP-19120:
-----------------------------------------

saxenapranav commented on code in PR #6633:
URL: https://github.com/apache/hadoop/pull/6633#discussion_r1636150107

##
hadoop-tools/hadoop-azure/src/main/java/org/apache/hadoop/fs/azurebfs/constants/FileSystemConfigurations.java:
##
@@ -167,5 +167,14 @@ public final class FileSystemConfigurations {
   public static final int HUNDRED = 100;
   public static final long THOUSAND = 1000L;
+  public static final HttpOperationType DEFAULT_NETWORKING_LIBRARY
+      = HttpOperationType.APACHE_HTTP_CLIENT;
+
+  public static final int DEFAULT_APACHE_HTTP_CLIENT_MAX_IO_EXCEPTION_RETRIES = 3;
+
+  public static final long DEFAULT_HTTP_CLIENT_CONN_MAX_IDLE_TIME = 5_000L;
+
+  public static final int DEFAULT_HTTP_CLIENT_CONN_MAX_IDLE_CONNECTIONS = 5;

Review Comment:
A larger cache size showed increased connection-establishment time. Hence, the default is not changed and is kept equal to the JDK's. Developers can use the config `fs.azure.apache.http.client.max.cache.connection.size` if they want to increase the cache size.
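To illustrate how a cap of 5 idle connections and a 5,000 ms idle time could interact, here is a minimal sketch of a bounded idle-connection cache. It is not the `KeepAliveCache` from the patch: the class `IdleConnectionCacheSketch`, its methods, and the explicit clock parameter are all hypothetical, and a real cache would also close the connections it rejects or evicts.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical sketch of a capped idle-connection cache; not the patch's KeepAliveCache.
final class IdleConnectionCacheSketch<C> {
  private static final class Entry<T> {
    final T connection;
    final long idleSinceMs;

    Entry(T connection, long idleSinceMs) {
      this.connection = connection;
      this.idleSinceMs = idleSinceMs;
    }
  }

  private final int maxIdleConnections;
  private final long maxIdleTimeMs;
  // Newest entries are at the head, oldest at the tail (assumes a non-decreasing clock).
  private final Deque<Entry<C>> idle = new ArrayDeque<>();

  IdleConnectionCacheSketch(int maxIdleConnections, long maxIdleTimeMs) {
    this.maxIdleConnections = maxIdleConnections;
    this.maxIdleTimeMs = maxIdleTimeMs;
  }

  /** Returns true if the connection was cached; false means the caller should close it. */
  synchronized boolean put(C connection, long nowMs) {
    evictExpired(nowMs);
    if (idle.size() >= maxIdleConnections) {
      return false;
    }
    idle.push(new Entry<>(connection, nowMs));
    return true;
  }

  /** Returns the most recently cached live connection, or null if none is available. */
  synchronized C get(long nowMs) {
    evictExpired(nowMs);
    Entry<C> entry = idle.poll();
    return entry == null ? null : entry.connection;
  }

  // Drops entries that have been idle longer than the limit, starting from the oldest.
  private void evictExpired(long nowMs) {
    while (!idle.isEmpty() && nowMs - idle.peekLast().idleSinceMs > maxIdleTimeMs) {
      idle.pollLast();
    }
  }
}
```

Passing the clock value explicitly keeps the sketch deterministic; production code would more likely read a monotonic clock internally.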
[jira] [Commented] (HADOOP-19120) [ABFS]: ApacheHttpClient adaptation as network library
[ https://issues.apache.org/jira/browse/HADOOP-19120?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17854340#comment-17854340 ]

ASF GitHub Bot commented on HADOOP-19120:
-----------------------------------------

saxenapranav commented on code in PR #6633:
URL: https://github.com/apache/hadoop/pull/6633#discussion_r1636147376

##
hadoop-tools/hadoop-azure/src/main/java/org/apache/hadoop/fs/azurebfs/constants/ConfigurationKeys.java:
##
@@ -316,5 +316,10 @@ public static String accountProperty(String property, String account) {
    * @see FileSystem#openFile(org.apache.hadoop.fs.Path)
    */
   public static final String FS_AZURE_BUFFERED_PREAD_DISABLE = "fs.azure.buffered.pread.disable";
+  /**Defines what network library to use for server IO calls {@value }*/
+  public static final String FS_AZURE_NETWORKING_LIBRARY = "fs.azure.networking.library";
+  public static final String FS_AZURE_APACHE_HTTP_CLIENT_MAX_IO_EXCEPTION_RETRIES = "fs.azure.apache.http.client.max.io.exception.retries";

Review Comment:
I have added the Javadocs.
[jira] [Commented] (HADOOP-19120) [ABFS]: ApacheHttpClient adaptation as network library
[ https://issues.apache.org/jira/browse/HADOOP-19120?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17853930#comment-17853930 ]

ASF GitHub Bot commented on HADOOP-19120:
-----------------------------------------

hadoop-yetus commented on PR #6633:
URL: https://github.com/apache/hadoop/pull/6633#issuecomment-2159972430

:confetti_ball: **+1 overall**

| Vote | Subsystem | Runtime | Logfile | Comment |
|:----:|----------:|--------:|:--------:|:-------:|
| +0 :ok: | reexec | 0m 36s | | Docker mode activated. |
|||| _ Prechecks _ |
| +1 :green_heart: | dupname | 0m 1s | | No case conflicting files found. |
| +0 :ok: | codespell | 0m 0s | | codespell was not available. |
| +0 :ok: | detsecrets | 0m 0s | | detect-secrets was not available. |
| +1 :green_heart: | @author | 0m 0s | | The patch does not contain any @author tags. |
| +1 :green_heart: | test4tests | 0m 0s | | The patch appears to include 20 new or modified test files. |
|||| _ trunk Compile Tests _ |
| +1 :green_heart: | mvninstall | 48m 38s | | trunk passed |
| +1 :green_heart: | compile | 0m 40s | | trunk passed with JDK Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1 |
| +1 :green_heart: | compile | 0m 35s | | trunk passed with JDK Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06 |
| +1 :green_heart: | checkstyle | 0m 33s | | trunk passed |
| +1 :green_heart: | mvnsite | 0m 41s | | trunk passed |
| +1 :green_heart: | javadoc | 0m 39s | | trunk passed with JDK Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1 |
| +1 :green_heart: | javadoc | 0m 36s | | trunk passed with JDK Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06 |
| +1 :green_heart: | spotbugs | 1m 7s | | trunk passed |
| +1 :green_heart: | shadedclient | 39m 48s | | branch has no errors when building and testing our client artifacts. |
| -0 :warning: | patch | 40m 10s | | Used diff version of patch file. Binary files and potentially other changes not applied. Please rebase and squash commits if necessary. |
|||| _ Patch Compile Tests _ |
| +1 :green_heart: | mvninstall | 0m 31s | | the patch passed |
| +1 :green_heart: | compile | 0m 31s | | the patch passed with JDK Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1 |
| +1 :green_heart: | javac | 0m 31s | | the patch passed |
| +1 :green_heart: | compile | 0m 29s | | the patch passed with JDK Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06 |
| +1 :green_heart: | javac | 0m 29s | | the patch passed |
| +1 :green_heart: | blanks | 0m 0s | | The patch has no blanks issues. |
| -0 :warning: | checkstyle | 0m 21s | [/results-checkstyle-hadoop-tools_hadoop-azure.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6633/58/artifact/out/results-checkstyle-hadoop-tools_hadoop-azure.txt) | hadoop-tools/hadoop-azure: The patch generated 2 new + 18 unchanged - 0 fixed = 20 total (was 18) |
| +1 :green_heart: | mvnsite | 0m 31s | | the patch passed |
| +1 :green_heart: | javadoc | 0m 26s | | the patch passed with JDK Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1 |
| +1 :green_heart: | javadoc | 0m 26s | | the patch passed with JDK Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06 |
| +1 :green_heart: | spotbugs | 1m 7s | | the patch passed |
| +1 :green_heart: | shadedclient | 37m 49s | | patch has no errors when building and testing our client artifacts. |
|||| _ Other Tests _ |
| +1 :green_heart: | unit | 2m 30s | | hadoop-azure in the patch passed. |
| +1 :green_heart: | asflicense | 0m 37s | | The patch does not generate ASF License warnings. |
| | | 140m 58s | | |

| Subsystem | Report/Notes |
|----------:|:-------------|
| Docker | ClientAPI=1.45 ServerAPI=1.45 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6633/58/artifact/out/Dockerfile |
| GITHUB PR | https://github.com/apache/hadoop/pull/6633 |
| Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets |
| uname | Linux 524748d754f1 5.15.0-106-generic #116-Ubuntu SMP Wed Apr 17 09:17:56 UTC 2024 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | dev-support/bin/hadoop.sh |
| git revision | trunk / d2113456341d9a709c1fbb03f31460fc02fc2819 |
| Default Java | Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06 |
| Multi-JDK versions | /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06 |
| Test Results | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6633/58/testReport/ |
[jira] [Commented] (HADOOP-19120) [ABFS]: ApacheHttpClient adaptation as network library
[ https://issues.apache.org/jira/browse/HADOOP-19120?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17853689#comment-17853689 ]

ASF GitHub Bot commented on HADOOP-19120:
-----------------------------------------

hadoop-yetus commented on PR #6633:
URL: https://github.com/apache/hadoop/pull/6633#issuecomment-2158478028

:broken_heart: **-1 overall**

| Vote | Subsystem | Runtime | Logfile | Comment |
|:----:|----------:|--------:|:--------:|:-------:|
| +0 :ok: | reexec | 0m 33s | | Docker mode activated. |
|||| _ Prechecks _ |
| +1 :green_heart: | dupname | 0m 0s | | No case conflicting files found. |
| +0 :ok: | codespell | 0m 1s | | codespell was not available. |
| +0 :ok: | detsecrets | 0m 1s | | detect-secrets was not available. |
| +1 :green_heart: | @author | 0m 0s | | The patch does not contain any @author tags. |
| +1 :green_heart: | test4tests | 0m 0s | | The patch appears to include 20 new or modified test files. |
|||| _ trunk Compile Tests _ |
| +1 :green_heart: | mvninstall | 44m 48s | | trunk passed |
| +1 :green_heart: | compile | 0m 39s | | trunk passed with JDK Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1 |
| +1 :green_heart: | compile | 0m 34s | | trunk passed with JDK Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06 |
| +1 :green_heart: | checkstyle | 0m 31s | | trunk passed |
| +1 :green_heart: | mvnsite | 0m 40s | | trunk passed |
| +1 :green_heart: | javadoc | 0m 37s | | trunk passed with JDK Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1 |
| +1 :green_heart: | javadoc | 0m 30s | | trunk passed with JDK Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06 |
| +1 :green_heart: | spotbugs | 1m 3s | | trunk passed |
| +1 :green_heart: | shadedclient | 34m 32s | | branch has no errors when building and testing our client artifacts. |
| -0 :warning: | patch | 34m 55s | | Used diff version of patch file. Binary files and potentially other changes not applied. Please rebase and squash commits if necessary. |
|||| _ Patch Compile Tests _ |
| +1 :green_heart: | mvninstall | 0m 31s | | the patch passed |
| +1 :green_heart: | compile | 0m 32s | | the patch passed with JDK Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1 |
| +1 :green_heart: | javac | 0m 32s | | the patch passed |
| +1 :green_heart: | compile | 0m 28s | | the patch passed with JDK Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06 |
| +1 :green_heart: | javac | 0m 28s | | the patch passed |
| +1 :green_heart: | blanks | 0m 0s | | The patch has no blanks issues. |
| -0 :warning: | checkstyle | 0m 19s | [/results-checkstyle-hadoop-tools_hadoop-azure.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6633/57/artifact/out/results-checkstyle-hadoop-tools_hadoop-azure.txt) | hadoop-tools/hadoop-azure: The patch generated 6 new + 18 unchanged - 0 fixed = 24 total (was 18) |
| +1 :green_heart: | mvnsite | 0m 31s | | the patch passed |
| -1 :x: | javadoc | 0m 25s | [/patch-javadoc-hadoop-tools_hadoop-azure-jdkUbuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6633/57/artifact/out/patch-javadoc-hadoop-tools_hadoop-azure-jdkUbuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1.txt) | hadoop-azure in the patch failed with JDK Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1. |
| -1 :x: | javadoc | 0m 23s | [/patch-javadoc-hadoop-tools_hadoop-azure-jdkPrivateBuild-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6633/57/artifact/out/patch-javadoc-hadoop-tools_hadoop-azure-jdkPrivateBuild-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06.txt) | hadoop-azure in the patch failed with JDK Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06. |
| -1 :x: | spotbugs | 1m 8s | [/new-spotbugs-hadoop-tools_hadoop-azure.html](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6633/57/artifact/out/new-spotbugs-hadoop-tools_hadoop-azure.html) | hadoop-tools/hadoop-azure generated 1 new + 0 unchanged - 0 fixed = 1 total (was 0) |
| +1 :green_heart: | shadedclient | 34m 51s | | patch has no errors when building and testing our client artifacts. |
|||| _ Other Tests _ |
| +1 :green_heart: | unit | 2m 28s | | hadoop-azure in the patch passed. |
| +1 :green_heart: | asflicense | 0m 38s | | The patch does not generate ASF License warnings. |
| | | 128m 23s | | |

| Reason | Tests |
|-------:|:------|
| SpotBugs | module:hadoop-tools/hadoop-azure |
| | org.apache.hadoop.fs.azurebfs.services.KeepAliveCache defines equals but not hashCode At KeepAliveCache.java:hashCode At KeepAliveCache.java:[lines 235-244] |
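The SpotBugs finding above ("defines equals but not hashCode") is usually resolved by overriding both methods over the same fields, so that equal objects also hash alike and behave correctly in hash-based collections. The sketch below shows that general pattern only; `CacheKeySketch` and its `host` and `port` fields are hypothetical and do not reflect how `KeepAliveCache` was actually changed.

```java
import java.util.Objects;

// Hypothetical value class illustrating a consistent equals/hashCode pair.
final class CacheKeySketch {
  private final String host;
  private final int port;

  CacheKeySketch(String host, int port) {
    this.host = host;
    this.port = port;
  }

  @Override
  public boolean equals(Object other) {
    if (this == other) {
      return true;
    }
    if (!(other instanceof CacheKeySketch)) {
      return false;
    }
    CacheKeySketch that = (CacheKeySketch) other;
    return port == that.port && Objects.equals(host, that.host);
  }

  // Derived from exactly the fields compared in equals, which is the
  // consistency that the SpotBugs check looks for.
  @Override
  public int hashCode() {
    return Objects.hash(host, port);
  }
}
```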
[jira] [Commented] (HADOOP-19120) [ABFS]: ApacheHttpClient adaptation as network library
[ https://issues.apache.org/jira/browse/HADOOP-19120?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17853684#comment-17853684 ]

ASF GitHub Bot commented on HADOOP-19120:
-----------------------------------------

hadoop-yetus commented on PR #6633:
URL: https://github.com/apache/hadoop/pull/6633#issuecomment-2158467800

:broken_heart: **-1 overall**

| Vote | Subsystem | Runtime | Logfile | Comment |
|:----:|----------:|--------:|:--------:|:-------:|
| +0 :ok: | reexec | 0m 33s | | Docker mode activated. |
|||| _ Prechecks _ |
| +1 :green_heart: | dupname | 0m 1s | | No case conflicting files found. |
| +0 :ok: | codespell | 0m 1s | | codespell was not available. |
| +0 :ok: | detsecrets | 0m 1s | | detect-secrets was not available. |
| +1 :green_heart: | @author | 0m 0s | | The patch does not contain any @author tags. |
| +1 :green_heart: | test4tests | 0m 0s | | The patch appears to include 21 new or modified test files. |
|||| _ trunk Compile Tests _ |
| +1 :green_heart: | mvninstall | 44m 42s | | trunk passed |
| +1 :green_heart: | compile | 0m 37s | | trunk passed with JDK Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1 |
| +1 :green_heart: | compile | 0m 32s | | trunk passed with JDK Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06 |
| +1 :green_heart: | checkstyle | 0m 32s | | trunk passed |
| +1 :green_heart: | mvnsite | 0m 39s | | trunk passed |
| +1 :green_heart: | javadoc | 0m 36s | | trunk passed with JDK Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1 |
| +1 :green_heart: | javadoc | 0m 34s | | trunk passed with JDK Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06 |
| +1 :green_heart: | spotbugs | 1m 8s | | trunk passed |
| +1 :green_heart: | shadedclient | 34m 34s | | branch has no errors when building and testing our client artifacts. |
| -0 :warning: | patch | 34m 54s | | Used diff version of patch file. Binary files and potentially other changes not applied. Please rebase and squash commits if necessary. |
|||| _ Patch Compile Tests _ |
| +1 :green_heart: | mvninstall | 0m 30s | | the patch passed |
| +1 :green_heart: | compile | 0m 30s | | the patch passed with JDK Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1 |
| +1 :green_heart: | javac | 0m 30s | | the patch passed |
| +1 :green_heart: | compile | 0m 27s | | the patch passed with JDK Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06 |
| +1 :green_heart: | javac | 0m 27s | | the patch passed |
| +1 :green_heart: | blanks | 0m 0s | | The patch has no blanks issues. |
| -0 :warning: | checkstyle | 0m 20s | [/results-checkstyle-hadoop-tools_hadoop-azure.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6633/56/artifact/out/results-checkstyle-hadoop-tools_hadoop-azure.txt) | hadoop-tools/hadoop-azure: The patch generated 6 new + 18 unchanged - 0 fixed = 24 total (was 18) |
| +1 :green_heart: | mvnsite | 0m 31s | | the patch passed |
| -1 :x: | javadoc | 0m 26s | [/patch-javadoc-hadoop-tools_hadoop-azure-jdkUbuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6633/56/artifact/out/patch-javadoc-hadoop-tools_hadoop-azure-jdkUbuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1.txt) | hadoop-azure in the patch failed with JDK Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1. |
| -1 :x: | javadoc | 0m 25s | [/patch-javadoc-hadoop-tools_hadoop-azure-jdkPrivateBuild-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6633/56/artifact/out/patch-javadoc-hadoop-tools_hadoop-azure-jdkPrivateBuild-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06.txt) | hadoop-azure in the patch failed with JDK Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06. |
| -1 :x: | spotbugs | 1m 15s | [/new-spotbugs-hadoop-tools_hadoop-azure.html](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6633/56/artifact/out/new-spotbugs-hadoop-tools_hadoop-azure.html) | hadoop-tools/hadoop-azure generated 1 new + 0 unchanged - 0 fixed = 1 total (was 0) |
| +1 :green_heart: | shadedclient | 34m 51s | | patch has no errors when building and testing our client artifacts. |
|||| _ Other Tests _ |
| +1 :green_heart: | unit | 2m 23s | | hadoop-azure in the patch passed. |
| +1 :green_heart: | asflicense | 0m 34s | | The patch does not generate ASF License warnings. |
| | | 127m 59s | | |

| Reason | Tests |
|-------:|:------|
| SpotBugs | module:hadoop-tools/hadoop-azure |
| | org.apache.hadoop.fs.azurebfs.services.KeepAliveCache defines equals but not hashCode At KeepAliveCache.java:hashCode At KeepAliveCache.java:[lines 235-244] |
[jira] [Commented] (HADOOP-19120) [ABFS]: ApacheHttpClient adaptation as network library
[ https://issues.apache.org/jira/browse/HADOOP-19120?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17853603#comment-17853603 ]

ASF GitHub Bot commented on HADOOP-19120:
-----------------------------------------

hadoop-yetus commented on PR #6633:
URL: https://github.com/apache/hadoop/pull/6633#issuecomment-2157899785

:broken_heart: **-1 overall**

| Vote | Subsystem | Runtime | Logfile | Comment |
|:----:|----------:|--------:|:--------:|:-------:|
| +0 :ok: | reexec | 0m 32s | | Docker mode activated. |
|||| _ Prechecks _ |
| +1 :green_heart: | dupname | 0m 1s | | No case conflicting files found. |
| +0 :ok: | codespell | 0m 1s | | codespell was not available. |
| +0 :ok: | detsecrets | 0m 1s | | detect-secrets was not available. |
| +1 :green_heart: | @author | 0m 0s | | The patch does not contain any @author tags. |
| +1 :green_heart: | test4tests | 0m 0s | | The patch appears to include 21 new or modified test files. |
|||| _ trunk Compile Tests _ |
| +1 :green_heart: | mvninstall | 44m 59s | | trunk passed |
| +1 :green_heart: | compile | 0m 36s | | trunk passed with JDK Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1 |
| +1 :green_heart: | compile | 0m 34s | | trunk passed with JDK Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06 |
| +1 :green_heart: | checkstyle | 0m 30s | | trunk passed |
| +1 :green_heart: | mvnsite | 0m 39s | | trunk passed |
| +1 :green_heart: | javadoc | 0m 37s | | trunk passed with JDK Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1 |
| +1 :green_heart: | javadoc | 0m 32s | | trunk passed with JDK Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06 |
| +1 :green_heart: | spotbugs | 1m 3s | | trunk passed |
| +1 :green_heart: | shadedclient | 35m 2s | | branch has no errors when building and testing our client artifacts. |
| -0 :warning: | patch | 35m 21s | | Used diff version of patch file. Binary files and potentially other changes not applied. Please rebase and squash commits if necessary. |
|||| _ Patch Compile Tests _ |
| +1 :green_heart: | mvninstall | 0m 29s | | the patch passed |
| +1 :green_heart: | compile | 0m 30s | | the patch passed with JDK Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1 |
| +1 :green_heart: | javac | 0m 30s | | the patch passed |
| +1 :green_heart: | compile | 0m 27s | | the patch passed with JDK Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06 |
| +1 :green_heart: | javac | 0m 27s | | the patch passed |
| +1 :green_heart: | blanks | 0m 0s | | The patch has no blanks issues. |
| -0 :warning: | checkstyle | 0m 19s | [/results-checkstyle-hadoop-tools_hadoop-azure.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6633/55/artifact/out/results-checkstyle-hadoop-tools_hadoop-azure.txt) | hadoop-tools/hadoop-azure: The patch generated 17 new + 18 unchanged - 0 fixed = 35 total (was 18) |
| +1 :green_heart: | mvnsite | 0m 30s | | the patch passed |
| -1 :x: | javadoc | 0m 25s | [/patch-javadoc-hadoop-tools_hadoop-azure-jdkUbuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6633/55/artifact/out/patch-javadoc-hadoop-tools_hadoop-azure-jdkUbuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1.txt) | hadoop-azure in the patch failed with JDK Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1. |
| -1 :x: | javadoc | 0m 24s | [/patch-javadoc-hadoop-tools_hadoop-azure-jdkPrivateBuild-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6633/55/artifact/out/patch-javadoc-hadoop-tools_hadoop-azure-jdkPrivateBuild-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06.txt) | hadoop-azure in the patch failed with JDK Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06. |
| -1 :x: | spotbugs | 1m 7s | [/new-spotbugs-hadoop-tools_hadoop-azure.html](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6633/55/artifact/out/new-spotbugs-hadoop-tools_hadoop-azure.html) | hadoop-tools/hadoop-azure generated 4 new + 0 unchanged - 0 fixed = 4 total (was 0) |
| +1 :green_heart: | shadedclient | 34m 4s | | patch has no errors when building and testing our client artifacts. |
|||| _ Other Tests _ |
| +1 :green_heart: | unit | 2m 33s | | hadoop-azure in the patch passed. |
| -1 :x: | asflicense | 0m 38s | [/results-asflicense.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6633/55/artifact/out/results-asflicense.txt) | The patch generated 1 ASF License warnings. |
| | | 127m 57s | | |

| Reason | Tests |
|-------:|:------|
| SpotBugs | module:hadoop-tools/hadoop-azure |
| | org.apache.hadoop.fs.azurebfs.services.KeepAliveCac
[jira] [Commented] (HADOOP-19120) [ABFS]: ApacheHttpClient adaptation as network library
[ https://issues.apache.org/jira/browse/HADOOP-19120?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17853599#comment-17853599 ]

ASF GitHub Bot commented on HADOOP-19120:
-----------------------------------------

hadoop-yetus commented on PR #6633:
URL: https://github.com/apache/hadoop/pull/6633#issuecomment-2157882315

:broken_heart: **-1 overall**

| Vote | Subsystem | Runtime | Logfile | Comment |
|:----:|----------:|--------:|:--------:|:-------:|
| +0 :ok: | reexec | 0m 54s | | Docker mode activated. |
|||| _ Prechecks _ |
| +1 :green_heart: | dupname | 0m 1s | | No case conflicting files found. |
| +0 :ok: | codespell | 0m 0s | | codespell was not available. |
| +0 :ok: | detsecrets | 0m 0s | | detect-secrets was not available. |
| +1 :green_heart: | @author | 0m 0s | | The patch does not contain any @author tags. |
| +1 :green_heart: | test4tests | 0m 0s | | The patch appears to include 21 new or modified test files. |
|||| _ trunk Compile Tests _ |
| +1 :green_heart: | mvninstall | 44m 53s | | trunk passed |
| +1 :green_heart: | compile | 0m 37s | | trunk passed with JDK Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1 |
| +1 :green_heart: | compile | 0m 33s | | trunk passed with JDK Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06 |
| +1 :green_heart: | checkstyle | 0m 31s | | trunk passed |
| +1 :green_heart: | mvnsite | 0m 38s | | trunk passed |
| +1 :green_heart: | javadoc | 0m 37s | | trunk passed with JDK Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1 |
| +1 :green_heart: | javadoc | 0m 33s | | trunk passed with JDK Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06 |
| +1 :green_heart: | spotbugs | 1m 3s | | trunk passed |
| +1 :green_heart: | shadedclient | 34m 43s | | branch has no errors when building and testing our client artifacts. |
| -0 :warning: | patch | 35m 3s | | Used diff version of patch file. Binary files and potentially other changes not applied. Please rebase and squash commits if necessary. |
|||| _ Patch Compile Tests _ |
| +1 :green_heart: | mvninstall | 0m 27s | | the patch passed |
| +1 :green_heart: | compile | 0m 29s | | the patch passed with JDK Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1 |
| +1 :green_heart: | javac | 0m 29s | | the patch passed |
| +1 :green_heart: | compile | 0m 26s | | the patch passed with JDK Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06 |
| +1 :green_heart: | javac | 0m 26s | | the patch passed |
| +1 :green_heart: | blanks | 0m 0s | | The patch has no blanks issues. |
| -0 :warning: | checkstyle | 0m 20s | [/results-checkstyle-hadoop-tools_hadoop-azure.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6633/54/artifact/out/results-checkstyle-hadoop-tools_hadoop-azure.txt) | hadoop-tools/hadoop-azure: The patch generated 17 new + 18 unchanged - 0 fixed = 35 total (was 18) |
| +1 :green_heart: | mvnsite | 0m 31s | | the patch passed |
| -1 :x: | javadoc | 0m 26s | [/patch-javadoc-hadoop-tools_hadoop-azure-jdkUbuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6633/54/artifact/out/patch-javadoc-hadoop-tools_hadoop-azure-jdkUbuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1.txt) | hadoop-azure in the patch failed with JDK Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1. |
| -1 :x: | javadoc | 0m 25s | [/patch-javadoc-hadoop-tools_hadoop-azure-jdkPrivateBuild-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6633/54/artifact/out/patch-javadoc-hadoop-tools_hadoop-azure-jdkPrivateBuild-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06.txt) | hadoop-azure in the patch failed with JDK Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06. |
| -1 :x: | spotbugs | 1m 12s | [/new-spotbugs-hadoop-tools_hadoop-azure.html](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6633/54/artifact/out/new-spotbugs-hadoop-tools_hadoop-azure.html) | hadoop-tools/hadoop-azure generated 4 new + 0 unchanged - 0 fixed = 4 total (was 0) |
| +1 :green_heart: | shadedclient | 34m 44s | | patch has no errors when building and testing our client artifacts. |
|||| _ Other Tests _ |
| +1 :green_heart: | unit | 2m 26s | | hadoop-azure in the patch passed. |
| -1 :x: | asflicense | 0m 36s | [/results-asflicense.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6633/54/artifact/out/results-asflicense.txt) | The patch generated 2 ASF License warnings. |
| | | 128m 20s | | |

| Reason | Tests |
|-------:|:------|
| SpotBugs | module:hadoop-tools/hadoop-azure |
| | org.apache.hadoop.fs.azurebfs.services.KeepAliveCac
[jira] [Commented] (HADOOP-19120) [ABFS]: ApacheHttpClient adaptation as network library
[ https://issues.apache.org/jira/browse/HADOOP-19120?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17851565#comment-17851565 ]

ASF GitHub Bot commented on HADOOP-19120:
-----------------------------------------

hadoop-yetus commented on PR #6633:
URL: https://github.com/apache/hadoop/pull/6633#issuecomment-2144749670

:confetti_ball: **+1 overall**

| Vote | Subsystem | Runtime | Logfile | Comment |
|:----:|----------:|--------:|:--------:|:-------:|
| +0 :ok: | reexec | 15m 59s | | Docker mode activated. |
|||| _ Prechecks _ |
| +1 :green_heart: | dupname | 0m 1s | | No case conflicting files found. |
| +0 :ok: | codespell | 0m 0s | | codespell was not available. |
| +0 :ok: | detsecrets | 0m 0s | | detect-secrets was not available. |
| +1 :green_heart: | @author | 0m 0s | | The patch does not contain any @author tags. |
| +1 :green_heart: | test4tests | 0m 0s | | The patch appears to include 21 new or modified test files. |
|||| _ trunk Compile Tests _ |
| +1 :green_heart: | mvninstall | 46m 32s | | trunk passed |
| +1 :green_heart: | compile | 0m 39s | | trunk passed with JDK Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1 |
| +1 :green_heart: | compile | 0m 36s | | trunk passed with JDK Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06 |
| +1 :green_heart: | checkstyle | 0m 34s | | trunk passed |
| +1 :green_heart: | mvnsite | 0m 43s | | trunk passed |
| +1 :green_heart: | javadoc | 0m 40s | | trunk passed with JDK Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1 |
| +1 :green_heart: | javadoc | 0m 36s | | trunk passed with JDK Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06 |
| +1 :green_heart: | spotbugs | 1m 5s | | trunk passed |
| +1 :green_heart: | shadedclient | 33m 27s | | branch has no errors when building and testing our client artifacts. |
| -0 :warning: | patch | 33m 49s | | Used diff version of patch file. Binary files and potentially other changes not applied. Please rebase and squash commits if necessary. |
|||| _ Patch Compile Tests _ |
| +1 :green_heart: | mvninstall | 0m 30s | | the patch passed |
| +1 :green_heart: | compile | 0m 31s | | the patch passed with JDK Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1 |
| +1 :green_heart: | javac | 0m 31s | | the patch passed |
| +1 :green_heart: | compile | 0m 28s | | the patch passed with JDK Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06 |
| +1 :green_heart: | javac | 0m 28s | | the patch passed |
| +1 :green_heart: | blanks | 0m 0s | | The patch has no blanks issues. |
| -0 :warning: | checkstyle | 0m 20s | [/results-checkstyle-hadoop-tools_hadoop-azure.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6633/53/artifact/out/results-checkstyle-hadoop-tools_hadoop-azure.txt) | hadoop-tools/hadoop-azure: The patch generated 2 new + 18 unchanged - 0 fixed = 20 total (was 18) |
| +1 :green_heart: | mvnsite | 0m 32s | | the patch passed |
| +1 :green_heart: | javadoc | 0m 27s | | the patch passed with JDK Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1 |
| +1 :green_heart: | javadoc | 0m 26s | | the patch passed with JDK Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06 |
| +1 :green_heart: | spotbugs | 1m 8s | | the patch passed |
| +1 :green_heart: | shadedclient | 33m 42s | | patch has no errors when building and testing our client artifacts. |
|||| _ Other Tests _ |
| +1 :green_heart: | unit | 2m 25s | | hadoop-azure in the patch passed. |
| +1 :green_heart: | asflicense | 0m 38s | | The patch does not generate ASF License warnings. |
| | | 147m 5s | | |

| Subsystem | Report/Notes |
|----------:|:-------------|
| Docker | ClientAPI=1.45 ServerAPI=1.45 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6633/53/artifact/out/Dockerfile |
| GITHUB PR | https://github.com/apache/hadoop/pull/6633 |
| Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets |
| uname | Linux 10c157050ecc 5.15.0-106-generic #116-Ubuntu SMP Wed Apr 17 09:17:56 UTC 2024 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | dev-support/bin/hadoop.sh |
| git revision | trunk / bdbbd5d1624922c4925ead21571b4b278f24e360 |
| Default Java | Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06 |
| Multi-JDK versions | /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06 |
| Test Results | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6633/53/testReport/ |
[jira] [Commented] (HADOOP-19120) [ABFS]: ApacheHttpClient adaptation as network library
[ https://issues.apache.org/jira/browse/HADOOP-19120?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17851138#comment-17851138 ]

ASF GitHub Bot commented on HADOOP-19120:

steveloughran commented on code in PR #6633:
URL: https://github.com/apache/hadoop/pull/6633#discussion_r1621368582

## hadoop-tools/hadoop-azure/src/main/java/org/apache/hadoop/fs/azurebfs/constants/FileSystemConfigurations.java: ##
@@ -167,5 +167,14 @@ public final class FileSystemConfigurations {
   public static final int HUNDRED = 100;
   public static final long THOUSAND = 1000L;
+  public static final HttpOperationType DEFAULT_NETWORKING_LIBRARY
+      = HttpOperationType.APACHE_HTTP_CLIENT;
+
+  public static final int DEFAULT_APACHE_HTTP_CLIENT_MAX_IO_EXCEPTION_RETRIES = 3;
+
+  public static final long DEFAULT_HTTP_CLIENT_CONN_MAX_IDLE_TIME = 5_000L;
+
+  public static final int DEFAULT_HTTP_CLIENT_CONN_MAX_IDLE_CONNECTIONS = 5;

Review Comment:
   that's quite a low number, isn't it?

## hadoop-tools/hadoop-azure/src/main/java/org/apache/hadoop/fs/azurebfs/contracts/exceptions/AbfsApacheHttpExpect100Exception.java: ##
@@ -0,0 +1,36 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements. See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership. The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.hadoop.fs.azurebfs.contracts.exceptions;
+
+import java.io.IOException;
+
+import org.apache.http.HttpResponse;
+
+public class AbfsApacheHttpExpect100Exception extends IOException {
+  private final HttpResponse httpResponse;
+
+  public AbfsApacheHttpExpect100Exception(final String s, final HttpResponse httpResponse) {
+    super(s);
+    this.httpResponse = httpResponse;

Review Comment:
   pull this into the proposed superclass, add a requireNonNull()

## hadoop-tools/hadoop-azure/src/main/java/org/apache/hadoop/fs/azurebfs/services/AbfsConnectionManager.java: ##
@@ -0,0 +1,155 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements. See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership. The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.hadoop.fs.azurebfs.services;
+
+import java.io.IOException;
+import java.util.concurrent.ExecutionException;
+import java.util.concurrent.TimeUnit;
+
+import org.apache.http.HttpClientConnection;
+import org.apache.http.config.Registry;
+import org.apache.http.config.SocketConfig;
+import org.apache.http.conn.ConnectionPoolTimeoutException;
+import org.apache.http.conn.ConnectionRequest;
+import org.apache.http.conn.HttpClientConnectionManager;
+import org.apache.http.conn.HttpClientConnectionOperator;
+import org.apache.http.conn.routing.HttpRoute;
+import org.apache.http.conn.socket.ConnectionSocketFactory;
+import org.apache.http.impl.conn.DefaultHttpClientConnectionOperator;
+import org.apache.http.impl.conn.ManagedHttpClientConnectionFactory;
+import org.apache.http.protocol.HttpContext;
+
+/**
+ * AbfsConnectionManager is a custom implementation of {@link HttpClientConnectionManager}.
+ * This implementation manages connection-pooling heuristics and custom implementation
+ * of {@link ManagedHttpClientConnectionFactory}.
+ */
+class AbfsConnectionManager implements HttpClientConnectionManager {
+
+  private final KeepAliveCache kac = KeepAliveCache.getInstance();
+
+  private final AbfsConnFactory httpConnectionFactory;
+
+  private final HttpClientConnectionOperator connectionOperator;
+
+  AbfsConnectionManager(Registry socketFactoryRegistry,
+      AbfsConnFactory connectionFactory) {
+    this.httpC
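
For readers weighing the "quite a low number" remark, here is a minimal sketch of how a cap on idle connections and a maximum idle time can interact in a keep-alive cache. It is not the PR's `KeepAliveCache`: the class `IdleConnectionCache`, its `put`/`get` methods, and the explicit `nowMs` parameter are hypothetical names chosen for illustration, and the sketch assumes callers pass non-decreasing timestamps. The values 2 and 5_000L in the usage note are arbitrary; the diff above proposes 5 connections and 5_000L as defaults.

```java
import java.util.ArrayDeque;
import java.util.Deque;

/**
 * Hypothetical bounded idle-connection cache (illustration only, not the
 * PR's KeepAliveCache). C is whatever connection type the caller pools.
 */
final class IdleConnectionCache<C> {

  /** A parked connection plus the time at which it became idle. */
  private static final class Entry<C> {
    final C conn;
    final long idleSinceMs;

    Entry(C conn, long idleSinceMs) {
      this.conn = conn;
      this.idleSinceMs = idleSinceMs;
    }
  }

  private final int maxIdleConnections;
  private final long maxIdleTimeMs;
  // Head = most recently parked connection, tail = oldest one.
  private final Deque<Entry<C>> idle = new ArrayDeque<>();

  IdleConnectionCache(int maxIdleConnections, long maxIdleTimeMs) {
    this.maxIdleConnections = maxIdleConnections;
    this.maxIdleTimeMs = maxIdleTimeMs;
  }

  /**
   * Parks a connection for reuse.
   * Returns false when the cache is full; the caller should then close it.
   */
  synchronized boolean put(C conn, long nowMs) {
    evictExpired(nowMs);
    if (idle.size() >= maxIdleConnections) {
      return false;
    }
    idle.push(new Entry<>(conn, nowMs));
    return true;
  }

  /** Returns the most recently parked live connection, or null if none. */
  synchronized C get(long nowMs) {
    evictExpired(nowMs);
    Entry<C> entry = idle.poll();
    return entry == null ? null : entry.conn;
  }

  /** Drops entries idle for longer than the limit; real code would also close them. */
  private void evictExpired(long nowMs) {
    while (!idle.isEmpty()
        && nowMs - idle.peekLast().idleSinceMs > maxIdleTimeMs) {
      idle.pollLast();
    }
  }
}
```

With `new IdleConnectionCache<String>(2, 5_000L)`, a third `put` is rejected while two connections are parked, so under bursty load every connection beyond the cap is closed and must be re-established later. That is the trade-off the review comment is probing.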
[jira] [Commented] (HADOOP-19120) [ABFS]: ApacheHttpClient adaptation as network library
[ https://issues.apache.org/jira/browse/HADOOP-19120?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17850364#comment-17850364 ]

ASF GitHub Bot commented on HADOOP-19120:

hadoop-yetus commented on PR #6633:
URL: https://github.com/apache/hadoop/pull/6633#issuecomment-2137377073

:confetti_ball: **+1 overall**

| Vote | Subsystem | Runtime | Logfile | Comment |
|:----:|----------:|:--------|:-------:|:-------:|
| +0 :ok: | reexec | 0m 32s | | Docker mode activated. |
| | | | _ Prechecks _ | |
| +1 :green_heart: | dupname | 0m 1s | | No case conflicting files found. |
| +0 :ok: | codespell | 0m 0s | | codespell was not available. |
| +0 :ok: | detsecrets | 0m 0s | | detect-secrets was not available. |
| +1 :green_heart: | @author | 0m 0s | | The patch does not contain any @author tags. |
| +1 :green_heart: | test4tests | 0m 0s | | The patch appears to include 21 new or modified test files. |
| | | | _ trunk Compile Tests _ | |
| +1 :green_heart: | mvninstall | 44m 9s | | trunk passed |
| +1 :green_heart: | compile | 0m 40s | | trunk passed with JDK Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1 |
| +1 :green_heart: | compile | 0m 36s | | trunk passed with JDK Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06 |
| +1 :green_heart: | checkstyle | 0m 33s | | trunk passed |
| +1 :green_heart: | mvnsite | 0m 41s | | trunk passed |
| +1 :green_heart: | javadoc | 0m 40s | | trunk passed with JDK Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1 |
| +1 :green_heart: | javadoc | 0m 36s | | trunk passed with JDK Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06 |
| +1 :green_heart: | spotbugs | 1m 6s | | trunk passed |
| +1 :green_heart: | shadedclient | 33m 33s | | branch has no errors when building and testing our client artifacts. |
| -0 :warning: | patch | 33m 54s | | Used diff version of patch file. Binary files and potentially other changes not applied. Please rebase and squash commits if necessary. |
| | | | _ Patch Compile Tests _ | |
| +1 :green_heart: | mvninstall | 0m 31s | | the patch passed |
| +1 :green_heart: | compile | 0m 30s | | the patch passed with JDK Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1 |
| +1 :green_heart: | javac | 0m 30s | | the patch passed |
| +1 :green_heart: | compile | 0m 27s | | the patch passed with JDK Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06 |
| +1 :green_heart: | javac | 0m 27s | | the patch passed |
| +1 :green_heart: | blanks | 0m 0s | | The patch has no blanks issues. |
| -0 :warning: | checkstyle | 0m 21s | [/results-checkstyle-hadoop-tools_hadoop-azure.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6633/52/artifact/out/results-checkstyle-hadoop-tools_hadoop-azure.txt) | hadoop-tools/hadoop-azure: The patch generated 2 new + 18 unchanged - 0 fixed = 20 total (was 18) |
| +1 :green_heart: | mvnsite | 0m 30s | | the patch passed |
| +1 :green_heart: | javadoc | 0m 28s | | the patch passed with JDK Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1 |
| +1 :green_heart: | javadoc | 0m 26s | | the patch passed with JDK Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06 |
| +1 :green_heart: | spotbugs | 1m 6s | | the patch passed |
| +1 :green_heart: | shadedclient | 33m 17s | | patch has no errors when building and testing our client artifacts. |
| | | | _ Other Tests _ | |
| +1 :green_heart: | unit | 2m 20s | | hadoop-azure in the patch passed. |
| +1 :green_heart: | asflicense | 0m 38s | | The patch does not generate ASF License warnings. |
| | | 128m 36s | | |

| Subsystem | Report/Notes |
|----------:|:-------------|
| Docker | ClientAPI=1.45 ServerAPI=1.45 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6633/52/artifact/out/Dockerfile |
| GITHUB PR | https://github.com/apache/hadoop/pull/6633 |
| Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets |
| uname | Linux 9927f4158ffe 5.15.0-106-generic #116-Ubuntu SMP Wed Apr 17 09:17:56 UTC 2024 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | dev-support/bin/hadoop.sh |
| git revision | trunk / e1dc97ea2104fd89adfe3401d0061558f51cf27f |
| Default Java | Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06 |
| Multi-JDK versions | /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06 |
| Test Results | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6633/52/testReport/ |
[jira] [Commented] (HADOOP-19120) [ABFS]: ApacheHttpClient adaptation as network library
[ https://issues.apache.org/jira/browse/HADOOP-19120?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17850330#comment-17850330 ]

ASF GitHub Bot commented on HADOOP-19120:

hadoop-yetus commented on PR #6633:
URL: https://github.com/apache/hadoop/pull/6633#issuecomment-2137197512

:confetti_ball: **+1 overall**

| Vote | Subsystem | Runtime | Logfile | Comment |
|:----:|----------:|:--------|:-------:|:-------:|
| +0 :ok: | reexec | 31m 27s | | Docker mode activated. |
| | | | _ Prechecks _ | |
| +1 :green_heart: | dupname | 0m 1s | | No case conflicting files found. |
| +0 :ok: | codespell | 0m 0s | | codespell was not available. |
| +0 :ok: | detsecrets | 0m 0s | | detect-secrets was not available. |
| +1 :green_heart: | @author | 0m 0s | | The patch does not contain any @author tags. |
| +1 :green_heart: | test4tests | 0m 0s | | The patch appears to include 21 new or modified test files. |
| | | | _ trunk Compile Tests _ | |
| +1 :green_heart: | mvninstall | 50m 3s | | trunk passed |
| +1 :green_heart: | compile | 0m 39s | | trunk passed with JDK Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1 |
| +1 :green_heart: | compile | 0m 36s | | trunk passed with JDK Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06 |
| +1 :green_heart: | checkstyle | 0m 31s | | trunk passed |
| +1 :green_heart: | mvnsite | 0m 40s | | trunk passed |
| +1 :green_heart: | javadoc | 0m 39s | | trunk passed with JDK Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1 |
| +1 :green_heart: | javadoc | 0m 34s | | trunk passed with JDK Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06 |
| +1 :green_heart: | spotbugs | 1m 6s | | trunk passed |
| +1 :green_heart: | shadedclient | 39m 22s | | branch has no errors when building and testing our client artifacts. |
| -0 :warning: | patch | 39m 43s | | Used diff version of patch file. Binary files and potentially other changes not applied. Please rebase and squash commits if necessary. |
| | | | _ Patch Compile Tests _ | |
| +1 :green_heart: | mvninstall | 0m 30s | | the patch passed |
| +1 :green_heart: | compile | 0m 32s | | the patch passed with JDK Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1 |
| +1 :green_heart: | javac | 0m 32s | | the patch passed |
| +1 :green_heart: | compile | 0m 28s | | the patch passed with JDK Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06 |
| +1 :green_heart: | javac | 0m 28s | | the patch passed |
| +1 :green_heart: | blanks | 0m 0s | | The patch has no blanks issues. |
| -0 :warning: | checkstyle | 0m 20s | [/results-checkstyle-hadoop-tools_hadoop-azure.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6633/51/artifact/out/results-checkstyle-hadoop-tools_hadoop-azure.txt) | hadoop-tools/hadoop-azure: The patch generated 5 new + 18 unchanged - 0 fixed = 23 total (was 18) |
| +1 :green_heart: | mvnsite | 0m 30s | | the patch passed |
| +1 :green_heart: | javadoc | 0m 26s | | the patch passed with JDK Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1 |
| +1 :green_heart: | javadoc | 0m 25s | | the patch passed with JDK Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06 |
| +1 :green_heart: | spotbugs | 1m 6s | | the patch passed |
| +1 :green_heart: | shadedclient | 38m 56s | | patch has no errors when building and testing our client artifacts. |
| | | | _ Other Tests _ | |
| +1 :green_heart: | unit | 2m 21s | | hadoop-azure in the patch passed. |
| +1 :green_heart: | asflicense | 0m 34s | | The patch does not generate ASF License warnings. |
| | | 176m 55s | | |

| Subsystem | Report/Notes |
|----------:|:-------------|
| Docker | ClientAPI=1.45 ServerAPI=1.45 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6633/51/artifact/out/Dockerfile |
| GITHUB PR | https://github.com/apache/hadoop/pull/6633 |
| Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets |
| uname | Linux e6b5b97c7fe0 5.15.0-106-generic #116-Ubuntu SMP Wed Apr 17 09:17:56 UTC 2024 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | dev-support/bin/hadoop.sh |
| git revision | trunk / 0c5fbdc4fdb7849e92ac74751949c3c33d4c24a1 |
| Default Java | Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06 |
| Multi-JDK versions | /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06 |
| Test Results | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6633/51/testReport/ |
[jira] [Commented] (HADOOP-19120) [ABFS]: ApacheHttpClient adaptation as network library
[ https://issues.apache.org/jira/browse/HADOOP-19120?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17850321#comment-17850321 ]

ASF GitHub Bot commented on HADOOP-19120:

hadoop-yetus commented on PR #6633:
URL: https://github.com/apache/hadoop/pull/6633#issuecomment-2137116181

:confetti_ball: **+1 overall**

| Vote | Subsystem | Runtime | Logfile | Comment |
|:----:|----------:|:--------|:-------:|:-------:|
| +0 :ok: | reexec | 11m 51s | | Docker mode activated. |
| | | | _ Prechecks _ | |
| +1 :green_heart: | dupname | 0m 1s | | No case conflicting files found. |
| +0 :ok: | codespell | 0m 0s | | codespell was not available. |
| +0 :ok: | detsecrets | 0m 0s | | detect-secrets was not available. |
| +1 :green_heart: | @author | 0m 0s | | The patch does not contain any @author tags. |
| +1 :green_heart: | test4tests | 0m 0s | | The patch appears to include 21 new or modified test files. |
| | | | _ trunk Compile Tests _ | |
| +1 :green_heart: | mvninstall | 43m 44s | | trunk passed |
| +1 :green_heart: | compile | 0m 40s | | trunk passed with JDK Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1 |
| +1 :green_heart: | compile | 0m 34s | | trunk passed with JDK Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06 |
| +1 :green_heart: | checkstyle | 0m 33s | | trunk passed |
| +1 :green_heart: | mvnsite | 0m 40s | | trunk passed |
| +1 :green_heart: | javadoc | 0m 39s | | trunk passed with JDK Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1 |
| +1 :green_heart: | javadoc | 0m 34s | | trunk passed with JDK Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06 |
| +1 :green_heart: | spotbugs | 1m 9s | | trunk passed |
| +1 :green_heart: | shadedclient | 33m 38s | | branch has no errors when building and testing our client artifacts. |
| -0 :warning: | patch | 33m 59s | | Used diff version of patch file. Binary files and potentially other changes not applied. Please rebase and squash commits if necessary. |
| | | | _ Patch Compile Tests _ | |
| +1 :green_heart: | mvninstall | 0m 30s | | the patch passed |
| +1 :green_heart: | compile | 0m 30s | | the patch passed with JDK Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1 |
| +1 :green_heart: | javac | 0m 30s | | the patch passed |
| +1 :green_heart: | compile | 0m 28s | | the patch passed with JDK Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06 |
| +1 :green_heart: | javac | 0m 28s | | the patch passed |
| +1 :green_heart: | blanks | 0m 0s | | The patch has no blanks issues. |
| -0 :warning: | checkstyle | 0m 21s | [/results-checkstyle-hadoop-tools_hadoop-azure.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6633/50/artifact/out/results-checkstyle-hadoop-tools_hadoop-azure.txt) | hadoop-tools/hadoop-azure: The patch generated 5 new + 18 unchanged - 0 fixed = 23 total (was 18) |
| +1 :green_heart: | mvnsite | 0m 31s | | the patch passed |
| +1 :green_heart: | javadoc | 0m 27s | | the patch passed with JDK Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1 |
| +1 :green_heart: | javadoc | 0m 24s | | the patch passed with JDK Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06 |
| +1 :green_heart: | spotbugs | 1m 7s | | the patch passed |
| +1 :green_heart: | shadedclient | 33m 13s | | patch has no errors when building and testing our client artifacts. |
| | | | _ Other Tests _ | |
| +1 :green_heart: | unit | 2m 22s | | hadoop-azure in the patch passed. |
| +1 :green_heart: | asflicense | 0m 38s | | The patch does not generate ASF License warnings. |
| | | 139m 25s | | |

| Subsystem | Report/Notes |
|----------:|:-------------|
| Docker | ClientAPI=1.45 ServerAPI=1.45 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6633/50/artifact/out/Dockerfile |
| GITHUB PR | https://github.com/apache/hadoop/pull/6633 |
| Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets |
| uname | Linux 8af964d4d9f8 5.15.0-106-generic #116-Ubuntu SMP Wed Apr 17 09:17:56 UTC 2024 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | dev-support/bin/hadoop.sh |
| git revision | trunk / 0c5fbdc4fdb7849e92ac74751949c3c33d4c24a1 |
| Default Java | Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06 |
| Multi-JDK versions | /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06 |
| Test Results | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6633/50/testReport/ |
[jira] [Commented] (HADOOP-19120) [ABFS]: ApacheHttpClient adaptation as network library
[ https://issues.apache.org/jira/browse/HADOOP-19120?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17850314#comment-17850314 ]

ASF GitHub Bot commented on HADOOP-19120:

hadoop-yetus commented on PR #6633:
URL: https://github.com/apache/hadoop/pull/6633#issuecomment-2137005420

:broken_heart: **-1 overall**

| Vote | Subsystem | Runtime | Logfile | Comment |
|:----:|----------:|:--------|:-------:|:-------:|
| +0 :ok: | reexec | 0m 32s | | Docker mode activated. |
| | | | _ Prechecks _ | |
| +1 :green_heart: | dupname | 0m 1s | | No case conflicting files found. |
| +0 :ok: | codespell | 0m 0s | | codespell was not available. |
| +0 :ok: | detsecrets | 0m 0s | | detect-secrets was not available. |
| +1 :green_heart: | @author | 0m 0s | | The patch does not contain any @author tags. |
| +1 :green_heart: | test4tests | 0m 0s | | The patch appears to include 21 new or modified test files. |
| | | | _ trunk Compile Tests _ | |
| +1 :green_heart: | mvninstall | 45m 0s | | trunk passed |
| +1 :green_heart: | compile | 0m 35s | | trunk passed with JDK Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1 |
| +1 :green_heart: | compile | 0m 34s | | trunk passed with JDK Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06 |
| +1 :green_heart: | checkstyle | 0m 29s | | trunk passed |
| +1 :green_heart: | mvnsite | 0m 39s | | trunk passed |
| +1 :green_heart: | javadoc | 0m 35s | | trunk passed with JDK Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1 |
| +1 :green_heart: | javadoc | 0m 32s | | trunk passed with JDK Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06 |
| +1 :green_heart: | spotbugs | 1m 4s | | trunk passed |
| +1 :green_heart: | shadedclient | 34m 23s | | branch has no errors when building and testing our client artifacts. |
| -0 :warning: | patch | 34m 42s | | Used diff version of patch file. Binary files and potentially other changes not applied. Please rebase and squash commits if necessary. |
| | | | _ Patch Compile Tests _ | |
| +1 :green_heart: | mvninstall | 0m 28s | | the patch passed |
| +1 :green_heart: | compile | 0m 29s | | the patch passed with JDK Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1 |
| +1 :green_heart: | javac | 0m 29s | | the patch passed |
| +1 :green_heart: | compile | 0m 25s | | the patch passed with JDK Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06 |
| +1 :green_heart: | javac | 0m 25s | | the patch passed |
| +1 :green_heart: | blanks | 0m 0s | | The patch has no blanks issues. |
| -0 :warning: | checkstyle | 0m 19s | [/results-checkstyle-hadoop-tools_hadoop-azure.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6633/49/artifact/out/results-checkstyle-hadoop-tools_hadoop-azure.txt) | hadoop-tools/hadoop-azure: The patch generated 12 new + 18 unchanged - 0 fixed = 30 total (was 18) |
| +1 :green_heart: | mvnsite | 0m 28s | | the patch passed |
| +1 :green_heart: | javadoc | 0m 24s | | the patch passed with JDK Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1 |
| +1 :green_heart: | javadoc | 0m 24s | | the patch passed with JDK Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06 |
| +1 :green_heart: | spotbugs | 1m 4s | | the patch passed |
| +1 :green_heart: | shadedclient | 33m 56s | | patch has no errors when building and testing our client artifacts. |
| | | | _ Other Tests _ | |
| -1 :x: | unit | 2m 23s | [/patch-unit-hadoop-tools_hadoop-azure.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6633/49/artifact/out/patch-unit-hadoop-tools_hadoop-azure.txt) | hadoop-azure in the patch passed. |
| +1 :green_heart: | asflicense | 0m 37s | | The patch does not generate ASF License warnings. |
| | | 130m 37s | | |

| Reason | Tests |
|-------:|:------|
| Failed junit tests | hadoop.fs.azurebfs.services.TestApacheHttpClientFallback |

| Subsystem | Report/Notes |
|----------:|:-------------|
| Docker | ClientAPI=1.45 ServerAPI=1.45 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6633/49/artifact/out/Dockerfile |
| GITHUB PR | https://github.com/apache/hadoop/pull/6633 |
| Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets |
| uname | Linux db1fa28f98b8 5.15.0-106-generic #116-Ubuntu SMP Wed Apr 17 09:17:56 UTC 2024 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | dev-support/bin/hadoop.sh |
| git revision | trunk / bfe7ab90f9db8c948b8f36c940f09e069d393de4 |
| Default Java | Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06 |
| Mu
[jira] [Commented] (HADOOP-19120) [ABFS]: ApacheHttpClient adaptation as network library
[ https://issues.apache.org/jira/browse/HADOOP-19120?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17850311#comment-17850311 ]

ASF GitHub Bot commented on HADOOP-19120:

hadoop-yetus commented on PR #6633:
URL: https://github.com/apache/hadoop/pull/6633#issuecomment-2136988874

:broken_heart: **-1 overall**

| Vote | Subsystem | Runtime | Logfile | Comment |
|:----:|----------:|:--------|:-------:|:-------:|
| +0 :ok: | reexec | 0m 32s | | Docker mode activated. |
| | | | _ Prechecks _ | |
| +1 :green_heart: | dupname | 0m 1s | | No case conflicting files found. |
| +0 :ok: | codespell | 0m 0s | | codespell was not available. |
| +0 :ok: | detsecrets | 0m 0s | | detect-secrets was not available. |
| +1 :green_heart: | @author | 0m 0s | | The patch does not contain any @author tags. |
| +1 :green_heart: | test4tests | 0m 0s | | The patch appears to include 21 new or modified test files. |
| | | | _ trunk Compile Tests _ | |
| +1 :green_heart: | mvninstall | 44m 27s | | trunk passed |
| +1 :green_heart: | compile | 0m 37s | | trunk passed with JDK Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1 |
| +1 :green_heart: | compile | 0m 32s | | trunk passed with JDK Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06 |
| +1 :green_heart: | checkstyle | 0m 32s | | trunk passed |
| +1 :green_heart: | mvnsite | 0m 37s | | trunk passed |
| +1 :green_heart: | javadoc | 0m 38s | | trunk passed with JDK Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1 |
| +1 :green_heart: | javadoc | 0m 32s | | trunk passed with JDK Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06 |
| +1 :green_heart: | spotbugs | 1m 4s | | trunk passed |
| +1 :green_heart: | shadedclient | 34m 5s | | branch has no errors when building and testing our client artifacts. |
| -0 :warning: | patch | 34m 24s | | Used diff version of patch file. Binary files and potentially other changes not applied. Please rebase and squash commits if necessary. |
| | | | _ Patch Compile Tests _ | |
| +1 :green_heart: | mvninstall | 0m 29s | | the patch passed |
| +1 :green_heart: | compile | 0m 29s | | the patch passed with JDK Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1 |
| +1 :green_heart: | javac | 0m 29s | | the patch passed |
| +1 :green_heart: | compile | 0m 26s | | the patch passed with JDK Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06 |
| +1 :green_heart: | javac | 0m 26s | | the patch passed |
| +1 :green_heart: | blanks | 0m 0s | | The patch has no blanks issues. |
| -0 :warning: | checkstyle | 0m 19s | [/results-checkstyle-hadoop-tools_hadoop-azure.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6633/48/artifact/out/results-checkstyle-hadoop-tools_hadoop-azure.txt) | hadoop-tools/hadoop-azure: The patch generated 12 new + 18 unchanged - 0 fixed = 30 total (was 18) |
| +1 :green_heart: | mvnsite | 0m 29s | | the patch passed |
| +1 :green_heart: | javadoc | 0m 26s | | the patch passed with JDK Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1 |
| +1 :green_heart: | javadoc | 0m 24s | | the patch passed with JDK Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06 |
| +1 :green_heart: | spotbugs | 1m 3s | | the patch passed |
| +1 :green_heart: | shadedclient | 34m 53s | | patch has no errors when building and testing our client artifacts. |
| | | | _ Other Tests _ | |
| -1 :x: | unit | 2m 21s | [/patch-unit-hadoop-tools_hadoop-azure.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6633/48/artifact/out/patch-unit-hadoop-tools_hadoop-azure.txt) | hadoop-azure in the patch passed. |
| +1 :green_heart: | asflicense | 0m 35s | | The patch does not generate ASF License warnings. |
| | | 130m 56s | | |

| Reason | Tests |
|-------:|:------|
| Failed junit tests | hadoop.fs.azurebfs.services.TestApacheHttpClientFallback |

| Subsystem | Report/Notes |
|----------:|:-------------|
| Docker | ClientAPI=1.45 ServerAPI=1.45 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6633/48/artifact/out/Dockerfile |
| GITHUB PR | https://github.com/apache/hadoop/pull/6633 |
| Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets |
| uname | Linux 61a202c88da0 5.15.0-106-generic #116-Ubuntu SMP Wed Apr 17 09:17:56 UTC 2024 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | dev-support/bin/hadoop.sh |
| git revision | trunk / fd105ff693e18fc33aeacebb6583dca10476 |
| Default Java | Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06 |
| Mu
[jira] [Commented] (HADOOP-19120) [ABFS]: ApacheHttpClient adaptation as network library
[ https://issues.apache.org/jira/browse/HADOOP-19120?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17850285#comment-17850285 ]

ASF GitHub Bot commented on HADOOP-19120:
-----------------------------------------

saxenapranav commented on code in PR #6633:
URL: https://github.com/apache/hadoop/pull/6633#discussion_r1618363938

## hadoop-tools/hadoop-azure/src/main/java/org/apache/hadoop/fs/azurebfs/services/AbfsHttpOperation.java: ##
@@ -20,370 +20,515 @@
 import java.io.IOException;
 import java.io.InputStream;
-import java.io.OutputStream;
 import java.net.HttpURLConnection;
-import java.net.ProtocolException;
 import java.net.URL;
+import java.util.ArrayList;
 import java.util.List;
 import java.util.Map;

-import javax.net.ssl.HttpsURLConnection;
-import javax.net.ssl.SSLSocketFactory;
-
-import org.apache.hadoop.classification.VisibleForTesting;
-import org.apache.hadoop.security.ssl.DelegatingSSLSocketFactory;
-
+import com.fasterxml.jackson.core.JsonFactory;
+import com.fasterxml.jackson.core.JsonParser;
+import com.fasterxml.jackson.core.JsonToken;
+import com.fasterxml.jackson.databind.ObjectMapper;
 import org.slf4j.Logger;
-import org.slf4j.LoggerFactory;

 import org.apache.hadoop.fs.azurebfs.constants.AbfsHttpConstants;
 import org.apache.hadoop.fs.azurebfs.constants.HttpHeaderConfigurations;
-
-import static org.apache.hadoop.fs.azurebfs.constants.AbfsHttpConstants.EXPECT_100_JDK_ERROR;
-import static org.apache.hadoop.fs.azurebfs.constants.AbfsHttpConstants.HUNDRED_CONTINUE;
-import static org.apache.hadoop.fs.azurebfs.constants.AbfsHttpConstants.JDK_FALLBACK;
-import static org.apache.hadoop.fs.azurebfs.constants.AbfsHttpConstants.JDK_IMPL;
-import static org.apache.hadoop.fs.azurebfs.constants.HttpHeaderConfigurations.EXPECT;
+import org.apache.hadoop.fs.azurebfs.contracts.services.AbfsPerfLoggable;
+import org.apache.hadoop.fs.azurebfs.contracts.services.ListResultSchema;
+import org.apache.hadoop.fs.azurebfs.utils.UriUtils;

 /**
- * Implementation of {@link HttpOperation} for orchestrating calls using JDK's HttpURLConnection.
+ * Base Http operation class for orchestrating server IO calls. Child classes would
+ * define the certain orchestration implementation on the basis of network library used.
+ *
+ * For JDK netlib usage, the child class would be {@link AbfsJdkHttpOperation}.
+ * For ApacheHttpClient netlib usage, the child class would be {@link AbfsAHCHttpOperation}.
+ *
  */
-public class AbfsHttpOperation extends HttpOperation {
+public abstract class AbfsHttpOperation implements AbfsPerfLoggable {
+
+  private final Logger log;
+
+  private static final int CLEAN_UP_BUFFER_SIZE = 64 * 1024;
+
+  private static final int ONE_THOUSAND = 1000;
+
+  private static final int ONE_MILLION = ONE_THOUSAND * ONE_THOUSAND;
+
+  private final String method;
+  private final URL url;
+  private String maskedUrl;
+  private String maskedEncodedUrl;
+  private int statusCode;
+  private String statusDescription;
+  private String storageErrorCode = "";
+  private String storageErrorMessage = "";
+  private String requestId = "";
+  private String expectedAppendPos = "";
+  private ListResultSchema listResultSchema = null;
+
+  // metrics
+  private int bytesSent;
+  private int expectedBytesToBeSent;
+  private long bytesReceived;

-  private static final Logger LOG = LoggerFactory.getLogger(
-      AbfsHttpOperation.class);
+  private long connectionTimeMs;
+  private long sendRequestTimeMs;
+  private long recvResponseTimeMs;
+  private boolean shouldMask = false;

-  private HttpURLConnection connection;
+  private final List requestHeaders;

-  private boolean connectionDisconnectedOnError = false;
+  private final int connectionTimeout, readTimeout;

-  public static AbfsHttpOperation getAbfsHttpOperationWithFixedResult(
+  public AbfsHttpOperation(Logger logger,
       final URL url,
       final String method,
       final int httpStatus) {
-    AbfsHttpOperationWithFixedResult httpOp
-        = new AbfsHttpOperationWithFixedResult(url, method, httpStatus);
-    return httpOp;
+    this.log = logger;
+    this.url = url;
+    this.method = method;
+    this.statusCode = httpStatus;

Review Comment:
This is no longer required. Previously the value was hard-set in the child classes; now it is set directly in the parent class. The code in this block has been reverted to match trunk.
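The constructor discussed in this thread takes the HTTP status as an argument and stores it in the parent class. A minimal sketch of that arrangement follows, assuming that a fixed-result operation never reads a server response while a network-backed child fills in the status later. All class names here (`HttpOperationSketch`, `FixedResultOperationSketch`, `NetworkOperationSketch`) are hypothetical and are not the classes from the PR; `URI` and `IntSupplier` stand in for the real URL and connection types.

```java
import java.net.URI;
import java.util.function.IntSupplier;

/** Hypothetical base class; illustrative only, not the ABFS AbfsHttpOperation. */
abstract class HttpOperationSketch {
  private final URI uri;
  private final String method;
  private int statusCode;

  /** Normal path: the status is unknown (-1) until a response is processed. */
  protected HttpOperationSketch(URI uri, String method) {
    this(uri, method, -1);
  }

  /** Fixed-result path: the parent stores the status, so children need no setter call. */
  protected HttpOperationSketch(URI uri, String method, int httpStatus) {
    this.uri = uri;
    this.method = method;
    this.statusCode = httpStatus;
  }

  final URI getUri() {
    return uri;
  }

  final String getMethod() {
    return method;
  }

  final int getStatusCode() {
    return statusCode;
  }

  protected final void setStatusCode(int statusCode) {
    this.statusCode = statusCode;
  }

  /** Each network library supplies its own way of reading the response. */
  abstract void processResponse();
}

/** Operation whose result is known up front; it performs no network IO. */
final class FixedResultOperationSketch extends HttpOperationSketch {
  FixedResultOperationSketch(URI uri, String method, int httpStatus) {
    super(uri, method, httpStatus);
  }

  @Override
  void processResponse() {
    // Nothing to read: the status was fixed at construction time.
  }
}

/** Operation backed by a connection; the IntSupplier stands in for a real response. */
final class NetworkOperationSketch extends HttpOperationSketch {
  private final IntSupplier responseStatus;

  NetworkOperationSketch(URI uri, String method, IntSupplier responseStatus) {
    super(uri, method);
    this.responseStatus = responseStatus;
  }

  @Override
  void processResponse() {
    setStatusCode(responseStatus.getAsInt());
  }
}
```

With this split, only the fixed-result path passes a status to the parent constructor; an operation that talks to the server still takes its status from the response.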
[jira] [Commented] (HADOOP-19120) [ABFS]: ApacheHttpClient adaptation as network library
[ https://issues.apache.org/jira/browse/HADOOP-19120?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17850284#comment-17850284 ]

ASF GitHub Bot commented on HADOOP-19120:
-----------------------------------------

saxenapranav commented on code in PR #6633:
URL: https://github.com/apache/hadoop/pull/6633#discussion_r1618362874

## hadoop-tools/hadoop-azure/src/main/java/org/apache/hadoop/fs/azurebfs/services/AbfsClientThrottlingIntercept.java: ##
@@ -124,23 +125,24 @@ static AbfsClientThrottlingIntercept initializeSingleton(AbfsConfiguration abfsC
    * @return true if the operation is throttled and has some bytes to transfer.
    */
   private boolean updateBytesTransferred(boolean isThrottledOperation,
-      HttpOperation abfsHttpOperation) {
+      AbfsHttpOperation abfsHttpOperation) {
     return isThrottledOperation && abfsHttpOperation.getExpectedBytesToBeSent() > 0;
   }

   /**
    * Updates the metrics for successful and failed read and write operations.
+   *
    * @param operationType Only applicable for read and write operations.
-   * @param abfsHttpOperation Used for status code and data transferred.
+   * @param httpOperation Used for status code and data transferred.

Review Comment:
Taken.

> [ABFS]: ApacheHttpClient adaptation as network library
> ------------------------------------------------------
>
>                 Key: HADOOP-19120
>                 URL: https://issues.apache.org/jira/browse/HADOOP-19120
>             Project: Hadoop Common
>          Issue Type: Sub-task
>          Components: fs/azure
>    Affects Versions: 3.5.0
>            Reporter: Pranav Saxena
>            Assignee: Pranav Saxena
>            Priority: Major
>              Labels: pull-request-available
>
> Apache HttpClient is more feature-rich and flexible and gives the application
> more granular control over networking parameters.
> ABFS currently relies on the JDK-net library. This library is managed by
> OpenJDK and has no performance problem. However, it limits the application's
> control over networking, and there are very few APIs and hooks exposed that
> the application can use to get metrics or to choose which connection should
> be reused and when. ApacheHttpClient will give important hooks to fetch
> important metrics and control networking parameters.
> A custom implementation of connection-pool is used. The implementation is
> adapted from the JDK8 connection pooling. Reasons for doing it:
> 1. The PoolingHttpClientConnectionManager heuristic caches all the reusable
> connections it has created. JDK's implementation caches only a limited number
> of connections. The limit is given by the JVM system property
> "http.maxConnections". If the system property is not set, it defaults to 5.
> Connection-establishment latency increased when all the connections were
> cached. Hence, adapting the pooling heuristic of JDK netlib.
> 2. PoolingHttpClientConnectionManager expects the application to
> provide `setMaxPerRoute` and `setMaxTotal`, which the implementation uses as
> the total number of connections it can create. For an application using ABFS,
> it is not feasible to provide a value in the initialisation of the
> connectionManager. JDK's implementation has no cap on the number of
> connections it can have open at any moment. Hence, adapting the pooling
> heuristic of JDK netlib.

--
This message was sent by Atlassian Jira
(v8.20.10#820010)

---------------------------------------------------------------------
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org
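The two reasons in the issue description amount to one pooling heuristic: cap how many idle connections are cached, but never cap how many connections may be opened. The following is a minimal sketch of that heuristic, not the connection pool from the PR. The class `BoundedKeepAliveCache` and its methods are hypothetical names; only the `http.maxConnections` property and its default of 5 come from the description above.

```java
import java.io.Closeable;
import java.io.IOException;
import java.util.ArrayDeque;
import java.util.Deque;

/**
 * Hypothetical sketch of a bounded keep-alive cache. The names are
 * illustrative and are not taken from the ABFS patch.
 */
final class BoundedKeepAliveCache<C extends Closeable> {

  /** Property and fallback value cited in the issue description. */
  static final String MAX_CONN_PROPERTY = "http.maxConnections";
  static final int DEFAULT_MAX_CACHED = 5;

  private final int maxCached;
  private final Deque<C> idle = new ArrayDeque<>();

  /** Reads the limit from the system property, falling back to 5. */
  BoundedKeepAliveCache() {
    this(Integer.getInteger(MAX_CONN_PROPERTY, DEFAULT_MAX_CACHED));
  }

  BoundedKeepAliveCache(int maxCached) {
    if (maxCached < 0) {
      throw new IllegalArgumentException("maxCached must be >= 0");
    }
    this.maxCached = maxCached;
  }

  /**
   * Returns the most recently cached idle connection, or null if none is
   * cached. On null the caller opens a new connection; opening is not capped.
   */
  synchronized C poll() {
    return idle.pollFirst();
  }

  /**
   * Caches a reusable connection if there is room; otherwise closes it.
   *
   * @return true if the connection was cached, false if it was closed.
   */
  boolean offer(C connection) {
    synchronized (this) {
      if (idle.size() < maxCached) {
        idle.addFirst(connection);
        return true;
      }
    }
    try {
      connection.close();
    } catch (IOException e) {
      // Best effort: the connection is being discarded anyway.
    }
    return false;
  }

  synchronized int size() {
    return idle.size();
  }
}
```

Only `offer` enforces a limit, so the sketch never blocks a caller that needs a new connection. That is the property the description says cannot be obtained with `setMaxTotal`, which needs a value chosen up front.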
[jira] [Commented] (HADOOP-19120) [ABFS]: ApacheHttpClient adaptation as network library
[ https://issues.apache.org/jira/browse/HADOOP-19120?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17850281#comment-17850281 ]

ASF GitHub Bot commented on HADOOP-19120:
-----------------------------------------

hadoop-yetus commented on PR #6633:
URL: https://github.com/apache/hadoop/pull/6633#issuecomment-2136737338

:broken_heart: **-1 overall**

| Vote | Subsystem | Runtime | Logfile | Comment |
|:----:|----------:|--------:|:--------:|:-------:|
| +0 :ok: | reexec | 15m 4s | | Docker mode activated. |
|||| _ Prechecks _ |
| +1 :green_heart: | dupname | 0m 1s | | No case conflicting files found. |
| +0 :ok: | codespell | 0m 1s | | codespell was not available. |
| +0 :ok: | detsecrets | 0m 1s | | detect-secrets was not available. |
| +1 :green_heart: | @author | 0m 0s | | The patch does not contain any @author tags. |
| +1 :green_heart: | test4tests | 0m 0s | | The patch appears to include 21 new or modified test files. |
|||| _ trunk Compile Tests _ |
| +1 :green_heart: | mvninstall | 43m 38s | | trunk passed |
| +1 :green_heart: | compile | 0m 38s | | trunk passed with JDK Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1 |
| +1 :green_heart: | compile | 0m 37s | | trunk passed with JDK Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06 |
| +1 :green_heart: | checkstyle | 0m 34s | | trunk passed |
| +1 :green_heart: | mvnsite | 0m 42s | | trunk passed |
| +1 :green_heart: | javadoc | 0m 39s | | trunk passed with JDK Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1 |
| +1 :green_heart: | javadoc | 0m 35s | | trunk passed with JDK Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06 |
| +1 :green_heart: | spotbugs | 1m 7s | | trunk passed |
| +1 :green_heart: | shadedclient | 33m 28s | | branch has no errors when building and testing our client artifacts. |
| -0 :warning: | patch | 33m 50s | | Used diff version of patch file. Binary files and potentially other changes not applied. Please rebase and squash commits if necessary. |
|||| _ Patch Compile Tests _ |
| +1 :green_heart: | mvninstall | 0m 31s | | the patch passed |
| +1 :green_heart: | compile | 0m 30s | | the patch passed with JDK Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1 |
| +1 :green_heart: | javac | 0m 30s | | the patch passed |
| +1 :green_heart: | compile | 0m 27s | | the patch passed with JDK Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06 |
| +1 :green_heart: | javac | 0m 27s | | the patch passed |
| +1 :green_heart: | blanks | 0m 0s | | The patch has no blanks issues. |
| -0 :warning: | checkstyle | 0m 21s | [/results-checkstyle-hadoop-tools_hadoop-azure.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6633/47/artifact/out/results-checkstyle-hadoop-tools_hadoop-azure.txt) | hadoop-tools/hadoop-azure: The patch generated 12 new + 18 unchanged - 0 fixed = 30 total (was 18) |
| +1 :green_heart: | mvnsite | 0m 31s | | the patch passed |
| +1 :green_heart: | javadoc | 0m 28s | | the patch passed with JDK Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1 |
| +1 :green_heart: | javadoc | 0m 26s | | the patch passed with JDK Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06 |
| +1 :green_heart: | spotbugs | 1m 7s | | the patch passed |
| +1 :green_heart: | shadedclient | 33m 42s | | patch has no errors when building and testing our client artifacts. |
|||| _ Other Tests _ |
| -1 :x: | unit | 2m 21s | [/patch-unit-hadoop-tools_hadoop-azure.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6633/47/artifact/out/patch-unit-hadoop-tools_hadoop-azure.txt) | hadoop-azure in the patch passed. |
| +1 :green_heart: | asflicense | 0m 36s | | The patch does not generate ASF License warnings. |
| | | 142m 55s | | |

| Reason | Tests |
|-------:|:------|
| Failed junit tests | hadoop.fs.azurebfs.services.TestApacheHttpClientFallback |

| Subsystem | Report/Notes |
|----------:|:-------------|
| Docker | ClientAPI=1.45 ServerAPI=1.45 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6633/47/artifact/out/Dockerfile |
| GITHUB PR | https://github.com/apache/hadoop/pull/6633 |
| Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets |
| uname | Linux ce0cfcbb168e 5.15.0-106-generic #116-Ubuntu SMP Wed Apr 17 09:17:56 UTC 2024 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | dev-support/bin/hadoop.sh |
| git revision | trunk / 9cec3e7df6fd5a87e03f003c1173ec59e4648edb |
| Default Java | Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06 |
| Mu
[jira] [Commented] (HADOOP-19120) [ABFS]: ApacheHttpClient adaptation as network library
[ https://issues.apache.org/jira/browse/HADOOP-19120?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17836595#comment-17836595 ]

ASF GitHub Bot commented on HADOOP-19120:
-----------------------------------------

anmolanmol1234 commented on code in PR #6633:
URL: https://github.com/apache/hadoop/pull/6633#discussion_r1562502863

## hadoop-tools/hadoop-azure/src/main/java/org/apache/hadoop/fs/azurebfs/services/AbfsHttpOperation.java: ##
@@ -20,370 +20,515 @@
 import java.io.IOException;
 import java.io.InputStream;
-import java.io.OutputStream;
 import java.net.HttpURLConnection;
-import java.net.ProtocolException;
 import java.net.URL;
+import java.util.ArrayList;
 import java.util.List;
 import java.util.Map;

-import javax.net.ssl.HttpsURLConnection;
-import javax.net.ssl.SSLSocketFactory;
-
-import org.apache.hadoop.classification.VisibleForTesting;
-import org.apache.hadoop.security.ssl.DelegatingSSLSocketFactory;
-
+import com.fasterxml.jackson.core.JsonFactory;
+import com.fasterxml.jackson.core.JsonParser;
+import com.fasterxml.jackson.core.JsonToken;
+import com.fasterxml.jackson.databind.ObjectMapper;
 import org.slf4j.Logger;
-import org.slf4j.LoggerFactory;

 import org.apache.hadoop.fs.azurebfs.constants.AbfsHttpConstants;
 import org.apache.hadoop.fs.azurebfs.constants.HttpHeaderConfigurations;
-
-import static org.apache.hadoop.fs.azurebfs.constants.AbfsHttpConstants.EXPECT_100_JDK_ERROR;
-import static org.apache.hadoop.fs.azurebfs.constants.AbfsHttpConstants.HUNDRED_CONTINUE;
-import static org.apache.hadoop.fs.azurebfs.constants.AbfsHttpConstants.JDK_FALLBACK;
-import static org.apache.hadoop.fs.azurebfs.constants.AbfsHttpConstants.JDK_IMPL;
-import static org.apache.hadoop.fs.azurebfs.constants.HttpHeaderConfigurations.EXPECT;
+import org.apache.hadoop.fs.azurebfs.contracts.services.AbfsPerfLoggable;
+import org.apache.hadoop.fs.azurebfs.contracts.services.ListResultSchema;
+import org.apache.hadoop.fs.azurebfs.utils.UriUtils;

 /**
- * Implementation of {@link HttpOperation} for orchestrating calls using JDK's HttpURLConnection.
+ * Base Http operation class for orchestrating server IO calls. Child classes would
+ * define the certain orchestration implementation on the basis of network library used.
+ *
+ * For JDK netlib usage, the child class would be {@link AbfsJdkHttpOperation}.
+ * For ApacheHttpClient netlib usage, the child class would be {@link AbfsAHCHttpOperation}.
+ *
  */
-public class AbfsHttpOperation extends HttpOperation {
+public abstract class AbfsHttpOperation implements AbfsPerfLoggable {
+
+  private final Logger log;
+
+  private static final int CLEAN_UP_BUFFER_SIZE = 64 * 1024;
+
+  private static final int ONE_THOUSAND = 1000;
+
+  private static final int ONE_MILLION = ONE_THOUSAND * ONE_THOUSAND;
+
+  private final String method;
+  private final URL url;
+  private String maskedUrl;
+  private String maskedEncodedUrl;
+  private int statusCode;
+  private String statusDescription;
+  private String storageErrorCode = "";
+  private String storageErrorMessage = "";
+  private String requestId = "";
+  private String expectedAppendPos = "";
+  private ListResultSchema listResultSchema = null;
+
+  // metrics
+  private int bytesSent;
+  private int expectedBytesToBeSent;
+  private long bytesReceived;

-  private static final Logger LOG = LoggerFactory.getLogger(
-      AbfsHttpOperation.class);
+  private long connectionTimeMs;
+  private long sendRequestTimeMs;
+  private long recvResponseTimeMs;
+  private boolean shouldMask = false;

-  private HttpURLConnection connection;
+  private final List requestHeaders;

-  private boolean connectionDisconnectedOnError = false;
+  private final int connectionTimeout, readTimeout;

-  public static AbfsHttpOperation getAbfsHttpOperationWithFixedResult(
+  public AbfsHttpOperation(Logger logger,
       final URL url,
       final String method,
       final int httpStatus) {
-    AbfsHttpOperationWithFixedResult httpOp
-        = new AbfsHttpOperationWithFixedResult(url, method, httpStatus);
-    return httpOp;
+    this.log = logger;
+    this.url = url;
+    this.method = method;
+    this.statusCode = httpStatus;

Review Comment:
Shouldn't the status code come from the connection response?
[jira] [Commented] (HADOOP-19120) [ABFS]: ApacheHttpClient adaptation as network library
[ https://issues.apache.org/jira/browse/HADOOP-19120?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17836577#comment-17836577 ]

ASF GitHub Bot commented on HADOOP-19120:
-----------------------------------------

anmolanmol1234 commented on code in PR #6633:
URL: https://github.com/apache/hadoop/pull/6633#discussion_r1562462047

## hadoop-tools/hadoop-azure/src/main/java/org/apache/hadoop/fs/azurebfs/services/AbfsClientThrottlingIntercept.java: ##
@@ -124,23 +125,24 @@ static AbfsClientThrottlingIntercept initializeSingleton(AbfsConfiguration abfsC
    * @return true if the operation is throttled and has some bytes to transfer.
    */
   private boolean updateBytesTransferred(boolean isThrottledOperation,
-      HttpOperation abfsHttpOperation) {
+      AbfsHttpOperation abfsHttpOperation) {
     return isThrottledOperation && abfsHttpOperation.getExpectedBytesToBeSent() > 0;
   }

   /**
    * Updates the metrics for successful and failed read and write operations.
+   *
    * @param operationType Only applicable for read and write operations.
-   * @param abfsHttpOperation Used for status code and data transferred.
+   * @param httpOperation Used for status code and data transferred.

Review Comment:
nit: the param name can stay as abfsHttpOperation.
[jira] [Commented] (HADOOP-19120) [ABFS]: ApacheHttpClient adaptation as network library
[ https://issues.apache.org/jira/browse/HADOOP-19120?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17836550#comment-17836550 ]

ASF GitHub Bot commented on HADOOP-19120:
-----------------------------------------

anmolanmol1234 commented on code in PR #6633:
URL: https://github.com/apache/hadoop/pull/6633#discussion_r1562362652

## hadoop-tools/hadoop-azure/src/main/java/org/apache/hadoop/fs/azurebfs/services/AbfsAHCHttpOperation.java: ##
@@ -0,0 +1,422 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements. See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership. The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.hadoop.fs.azurebfs.services;
+
+import java.io.IOException;
+import java.io.InputStream;
+import java.net.URI;
+import java.net.URISyntaxException;
+import java.net.URL;
+import java.util.ArrayList;
+import java.util.Collections;
+import java.util.HashMap;
+import java.util.List;
+import java.util.Map;
+
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import org.apache.hadoop.classification.VisibleForTesting;
+import org.apache.hadoop.fs.azurebfs.AbfsConfiguration;
+import org.apache.hadoop.fs.azurebfs.constants.AbfsHttpConstants;
+import org.apache.hadoop.fs.azurebfs.constants.HttpHeaderConfigurations;
+import org.apache.hadoop.fs.azurebfs.contracts.exceptions.AbfsApacheHttpExpect100Exception;
+import org.apache.hadoop.security.ssl.DelegatingSSLSocketFactory;
+import org.apache.http.Header;
+import org.apache.http.HttpEntity;
+import org.apache.http.HttpResponse;
+import org.apache.http.client.methods.CloseableHttpResponse;
+import org.apache.http.client.methods.HttpDelete;
+import org.apache.http.client.methods.HttpEntityEnclosingRequestBase;
+import org.apache.http.client.methods.HttpGet;
+import org.apache.http.client.methods.HttpHead;
+import org.apache.http.client.methods.HttpPatch;
+import org.apache.http.client.methods.HttpPost;
+import org.apache.http.client.methods.HttpPut;
+import org.apache.http.client.methods.HttpRequestBase;
+import org.apache.http.entity.ByteArrayEntity;
+import org.apache.http.util.EntityUtils;
+
+import static org.apache.hadoop.fs.azurebfs.constants.AbfsHttpConstants.APACHE_IMPL;
+import static org.apache.hadoop.fs.azurebfs.constants.AbfsHttpConstants.HTTP_METHOD_DELETE;
+import static org.apache.hadoop.fs.azurebfs.constants.AbfsHttpConstants.HTTP_METHOD_GET;
+import static org.apache.hadoop.fs.azurebfs.constants.AbfsHttpConstants.HTTP_METHOD_HEAD;
+import static org.apache.hadoop.fs.azurebfs.constants.AbfsHttpConstants.HTTP_METHOD_PATCH;
+import static org.apache.hadoop.fs.azurebfs.constants.AbfsHttpConstants.HTTP_METHOD_POST;
+import static org.apache.hadoop.fs.azurebfs.constants.AbfsHttpConstants.HTTP_METHOD_PUT;
+import static org.apache.hadoop.fs.azurebfs.constants.HttpHeaderConfigurations.X_MS_CLIENT_REQUEST_ID;
+import static org.apache.http.entity.ContentType.TEXT_PLAIN;
+
+/**
+ * Implementation of {@link HttpOperation} for orchestrating server calls using
+ * Apache Http Client.
+ */
+public class AbfsAHCHttpOperation extends HttpOperation {
+
+  private static final Logger LOG = LoggerFactory.getLogger(
+      AbfsAHCHttpOperation.class);
+
+  /**
+   * Map to store the AbfsApacheHttpClient. Each instance of AbfsClient to have
+   * a unique AbfsApacheHttpClient instance. The key of the map is the UUID of the client.
+   */
+  private static final Map
+      ABFS_APACHE_HTTP_CLIENT_MAP = new HashMap<>();
+
+  private AbfsApacheHttpClient abfsApacheHttpClient;
+
+  private HttpRequestBase httpRequestBase;
+
+  private HttpResponse httpResponse;
+
+  private AbfsManagedHttpContext abfsHttpClientContext;
+
+  private final AbfsRestOperationType abfsRestOperationType;
+
+  private boolean connectionDisconnectedOnError = false;
+
+  private AbfsApacheHttpExpect100Exception abfsApacheHttpExpect100Exception;
+
+  private final boolean isPayloadRequest;
+
+  private List requestHeaders;
+
+  private AbfsAHCHttpOperation(final URL url,
+      final String method,
+      final List requestHeaders,
+      final AbfsRestOperationType abfsRestOperationType) {
+    super(LOG, url, method);
+    this.abfsRestOperationType = abfsRestOperationType;
+    this.requestHeaders = requestHeaders;
+    this.isPaylo
[jira] [Commented] (HADOOP-19120) [ABFS]: ApacheHttpClient adaptation as network library
[ https://issues.apache.org/jira/browse/HADOOP-19120?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17833550#comment-17833550 ]

ASF GitHub Bot commented on HADOOP-19120:
-----------------------------------------

hadoop-yetus commented on PR #6633:
URL: https://github.com/apache/hadoop/pull/6633#issuecomment-2034501601

:confetti_ball: **+1 overall**

| Vote | Subsystem | Runtime | Logfile | Comment |
|:----:|----------:|--------:|:--------:|:-------:|
| +0 :ok: | reexec | 0m 36s | | Docker mode activated. |
|||| _ Prechecks _ |
| +1 :green_heart: | dupname | 0m 1s | | No case conflicting files found. |
| +0 :ok: | codespell | 0m 1s | | codespell was not available. |
| +0 :ok: | detsecrets | 0m 1s | | detect-secrets was not available. |
| +1 :green_heart: | @author | 0m 0s | | The patch does not contain any @author tags. |
| +1 :green_heart: | test4tests | 0m 0s | | The patch appears to include 20 new or modified test files. |
|||| _ trunk Compile Tests _ |
| +1 :green_heart: | mvninstall | 45m 10s | | trunk passed |
| +1 :green_heart: | compile | 0m 39s | | trunk passed with JDK Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1 |
| +1 :green_heart: | compile | 0m 35s | | trunk passed with JDK Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06 |
| +1 :green_heart: | checkstyle | 0m 34s | | trunk passed |
| +1 :green_heart: | mvnsite | 0m 42s | | trunk passed |
| +1 :green_heart: | javadoc | 0m 39s | | trunk passed with JDK Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1 |
| +1 :green_heart: | javadoc | 0m 36s | | trunk passed with JDK Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06 |
| +1 :green_heart: | spotbugs | 1m 6s | | trunk passed |
| +1 :green_heart: | shadedclient | 33m 43s | | branch has no errors when building and testing our client artifacts. |
| -0 :warning: | patch | 34m 5s | | Used diff version of patch file. Binary files and potentially other changes not applied. Please rebase and squash commits if necessary. |
|||| _ Patch Compile Tests _ |
| +1 :green_heart: | mvninstall | 0m 28s | | the patch passed |
| +1 :green_heart: | compile | 0m 29s | | the patch passed with JDK Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1 |
| +1 :green_heart: | javac | 0m 29s | | the patch passed |
| +1 :green_heart: | compile | 0m 27s | | the patch passed with JDK Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06 |
| +1 :green_heart: | javac | 0m 27s | | the patch passed |
| +1 :green_heart: | blanks | 0m 0s | | The patch has no blanks issues. |
| -0 :warning: | checkstyle | 0m 21s | [/results-checkstyle-hadoop-tools_hadoop-azure.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6633/46/artifact/out/results-checkstyle-hadoop-tools_hadoop-azure.txt) | hadoop-tools/hadoop-azure: The patch generated 12 new + 18 unchanged - 0 fixed = 30 total (was 18) |
| +1 :green_heart: | mvnsite | 0m 29s | | the patch passed |
| +1 :green_heart: | javadoc | 0m 26s | | the patch passed with JDK Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1 |
| +1 :green_heart: | javadoc | 0m 26s | | the patch passed with JDK Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06 |
| +1 :green_heart: | spotbugs | 1m 5s | | the patch passed |
| +1 :green_heart: | shadedclient | 33m 33s | | patch has no errors when building and testing our client artifacts. |
|||| _ Other Tests _ |
| +1 :green_heart: | unit | 2m 27s | | hadoop-azure in the patch passed. |
| +1 :green_heart: | asflicense | 0m 37s | | The patch does not generate ASF License warnings. |
| | | 130m 10s | | |

| Subsystem | Report/Notes |
|----------:|:-------------|
| Docker | ClientAPI=1.45 ServerAPI=1.45 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6633/46/artifact/out/Dockerfile |
| GITHUB PR | https://github.com/apache/hadoop/pull/6633 |
| Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets |
| uname | Linux 436b44784dd0 5.15.0-94-generic #104-Ubuntu SMP Tue Jan 9 15:25:40 UTC 2024 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | dev-support/bin/hadoop.sh |
| git revision | trunk / fad4628f478b4e06fbb896303457408a66789c59 |
| Default Java | Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06 |
| Multi-JDK versions | /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06 |
| Test Results | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6633/46/testReport/ |
[jira] [Commented] (HADOOP-19120) [ABFS]: ApacheHttpClient adaptation as network library
[ https://issues.apache.org/jira/browse/HADOOP-19120?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17833503#comment-17833503 ]

ASF GitHub Bot commented on HADOOP-19120:
-----------------------------------------

hadoop-yetus commented on PR #6633:
URL: https://github.com/apache/hadoop/pull/6633#issuecomment-2034302508

:confetti_ball: **+1 overall**

| Vote | Subsystem | Runtime | Logfile | Comment |
|:----:|----------:|--------:|:--------:|:-------:|
| +0 :ok: | reexec | 0m 34s | | Docker mode activated. |
|||| _ Prechecks _ |
| +1 :green_heart: | dupname | 0m 1s | | No case conflicting files found. |
| +0 :ok: | codespell | 0m 0s | | codespell was not available. |
| +0 :ok: | detsecrets | 0m 0s | | detect-secrets was not available. |
| +1 :green_heart: | @author | 0m 0s | | The patch does not contain any @author tags. |
| +1 :green_heart: | test4tests | 0m 0s | | The patch appears to include 20 new or modified test files. |
|||| _ trunk Compile Tests _ |
| +1 :green_heart: | mvninstall | 46m 41s | | trunk passed |
| +1 :green_heart: | compile | 0m 38s | | trunk passed with JDK Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1 |
| +1 :green_heart: | compile | 0m 35s | | trunk passed with JDK Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06 |
| +1 :green_heart: | checkstyle | 0m 32s | | trunk passed |
| +1 :green_heart: | mvnsite | 0m 41s | | trunk passed |
| +1 :green_heart: | javadoc | 0m 40s | | trunk passed with JDK Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1 |
| +1 :green_heart: | javadoc | 0m 35s | | trunk passed with JDK Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06 |
| +1 :green_heart: | spotbugs | 1m 7s | | trunk passed |
| +1 :green_heart: | shadedclient | 33m 54s | | branch has no errors when building and testing our client artifacts. |
| -0 :warning: | patch | 34m 13s | | Used diff version of patch file. Binary files and potentially other changes not applied. Please rebase and squash commits if necessary. |
|||| _ Patch Compile Tests _ |
| +1 :green_heart: | mvninstall | 0m 30s | | the patch passed |
| +1 :green_heart: | compile | 0m 31s | | the patch passed with JDK Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1 |
| +1 :green_heart: | javac | 0m 31s | | the patch passed |
| +1 :green_heart: | compile | 0m 25s | | the patch passed with JDK Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06 |
| +1 :green_heart: | javac | 0m 25s | | the patch passed |
| +1 :green_heart: | blanks | 0m 0s | | The patch has no blanks issues. |
| -0 :warning: | checkstyle | 0m 20s | [/results-checkstyle-hadoop-tools_hadoop-azure.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6633/45/artifact/out/results-checkstyle-hadoop-tools_hadoop-azure.txt) | hadoop-tools/hadoop-azure: The patch generated 12 new + 18 unchanged - 0 fixed = 30 total (was 18) |
| +1 :green_heart: | mvnsite | 0m 29s | | the patch passed |
| +1 :green_heart: | javadoc | 0m 27s | | the patch passed with JDK Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1 |
| +1 :green_heart: | javadoc | 0m 24s | | the patch passed with JDK Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06 |
| +1 :green_heart: | spotbugs | 1m 4s | | the patch passed |
| +1 :green_heart: | shadedclient | 33m 37s | | patch has no errors when building and testing our client artifacts. |
|||| _ Other Tests _ |
| +1 :green_heart: | unit | 2m 22s | | hadoop-azure in the patch passed. |
| +1 :green_heart: | asflicense | 0m 36s | | The patch does not generate ASF License warnings. |
| | | 131m 9s | | |

| Subsystem | Report/Notes |
|----------:|:-------------|
| Docker | ClientAPI=1.45 ServerAPI=1.45 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6633/45/artifact/out/Dockerfile |
| GITHUB PR | https://github.com/apache/hadoop/pull/6633 |
| Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets |
| uname | Linux 2afd62b1519c 5.15.0-94-generic #104-Ubuntu SMP Tue Jan 9 15:25:40 UTC 2024 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | dev-support/bin/hadoop.sh |
| git revision | trunk / 21e1200c6de8594e9816721058eeb9bf0624a4d3 |
| Default Java | Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06 |
| Multi-JDK versions | /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06 |
| Test Results | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6633/45/testReport/ |
[jira] [Commented] (HADOOP-19120) [ABFS]: ApacheHttpClient adaptation as network library
[ https://issues.apache.org/jira/browse/HADOOP-19120?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17833493#comment-17833493 ]

ASF GitHub Bot commented on HADOOP-19120:
-----------------------------------------

saxenapranav commented on code in PR #6633:
URL: https://github.com/apache/hadoop/pull/6633#discussion_r1549455593

## hadoop-tools/hadoop-azure/src/main/java/org/apache/hadoop/fs/azurebfs/services/ApacheHttpClientHealthMonitor.java: ##
@@ -0,0 +1,33 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements. See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership. The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.hadoop.fs.azurebfs.services;
+
+
+public final class ApacheHttpClientHealthMonitor {

Review Comment:
I have removed the Monitor class and kept its logic in AbfsApacheHttpClient. I am not inclined to make the new classes inner classes, for the following reasons:
1. The classes are big, and nesting them would lead to very large source files.
2. Static inner classes would be difficult to handle with Mockito.
3. Each of the new classes encapsulates some logic and is not just an object description.

It would be good to know your thoughts on this.
> [ABFS]: ApacheHttpClient adaptation as network library > -- > > Key: HADOOP-19120 > URL: https://issues.apache.org/jira/browse/HADOOP-19120 > Project: Hadoop Common > Issue Type: Sub-task > Components: fs/azure >Affects Versions: 3.5.0 >Reporter: Pranav Saxena >Assignee: Pranav Saxena >Priority: Major > Labels: pull-request-available > > Apache HttpClient is more feature-rich and flexible and gives the application > more granular control over networking parameters. > ABFS currently relies on the JDK-net library. This library is managed by > OpenJDK and has no performance problem. However, it limits the application's > control over networking, and it exposes very few APIs and hooks that > the application can use to get metrics or to choose which connection > is reused and when. ApacheHttpClient will give important hooks to fetch > important metrics and control networking parameters. > A custom implementation of the connection pool is used. The implementation is > adapted from the JDK8 connection pooling. Reasons for doing it: > 1. The PoolingHttpClientConnectionManager heuristic caches all the reusable > connections it has created. JDK's implementation caches only a limited number > of connections. The limit is given by the JVM system property > "http.maxConnections". If the system property is not set, it defaults to 5. > Connection-establishment latency increased when all the connections were > cached. Hence the pooling heuristic of the JDK netlib is adapted. > 2. PoolingHttpClientConnectionManager expects the application to > provide `setMaxPerRoute` and `setMaxTotal`, which the implementation uses as > the limit on the number of connections it can create. For an application using ABFS, it > is not feasible to provide a value at the initialisation of the > connectionManager. JDK's implementation has no cap on the number of > connections it can have open at any moment.
Hence the pooling > heuristic of the JDK netlib is adapted. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
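The pooling heuristic described in the issue (keep only a bounded number of idle connections, with the bound taken from "http.maxConnections" and defaulting to 5) can be sketched roughly as below. This is a minimal illustration and not the KeepAliveCache from the PR: the class name `BoundedIdlePool`, its methods, and the most-recently-returned-first reuse order are all assumptions made for the example.

```java
import java.util.ArrayDeque;
import java.util.Deque;

/** Hypothetical illustration of a bounded idle-connection cache. */
final class BoundedIdlePool<C> {
  private final int maxIdle;
  private final Deque<C> idle = new ArrayDeque<>();

  BoundedIdlePool(int maxIdle) {
    this.maxIdle = maxIdle;
  }

  /** Reads the cap from the JVM system property named in the issue, default 5. */
  static int defaultMaxIdle() {
    return Integer.getInteger("http.maxConnections", 5);
  }

  /**
   * Offers a connection back for reuse.
   * @return true if it was cached; false if the cache is full and the
   *         caller should close the connection instead.
   */
  synchronized boolean offer(C connection) {
    if (idle.size() >= maxIdle) {
      return false;
    }
    idle.push(connection);
    return true;
  }

  /** @return the most recently cached connection, or null if none is idle. */
  synchronized C poll() {
    return idle.poll();
  }
}
```

Only idle connections are capped here; nothing limits how many connections a caller may open concurrently, which matches the second reason given in the description.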
[jira] [Commented] (HADOOP-19120) [ABFS]: ApacheHttpClient adaptation as network library
[ https://issues.apache.org/jira/browse/HADOOP-19120?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17833465#comment-17833465 ] ASF GitHub Bot commented on HADOOP-19120: - hadoop-yetus commented on PR #6633: URL: https://github.com/apache/hadoop/pull/6633#issuecomment-2034030038 :confetti_ball: **+1 overall** | Vote | Subsystem | Runtime | Logfile | Comment | |::|--:|:|::|:---:| | +0 :ok: | reexec | 0m 56s | | Docker mode activated. | _ Prechecks _ | | +1 :green_heart: | dupname | 0m 1s | | No case conflicting files found. | | +0 :ok: | codespell | 0m 1s | | codespell was not available. | | +0 :ok: | detsecrets | 0m 1s | | detect-secrets was not available. | | +1 :green_heart: | @author | 0m 0s | | The patch does not contain any @author tags. | | +1 :green_heart: | test4tests | 0m 0s | | The patch appears to include 21 new or modified test files. | _ trunk Compile Tests _ | | +1 :green_heart: | mvninstall | 44m 4s | | trunk passed | | +1 :green_heart: | compile | 0m 38s | | trunk passed with JDK Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1 | | +1 :green_heart: | compile | 0m 36s | | trunk passed with JDK Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06 | | +1 :green_heart: | checkstyle | 0m 33s | | trunk passed | | +1 :green_heart: | mvnsite | 0m 41s | | trunk passed | | +1 :green_heart: | javadoc | 0m 40s | | trunk passed with JDK Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1 | | +1 :green_heart: | javadoc | 0m 36s | | trunk passed with JDK Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06 | | +1 :green_heart: | spotbugs | 1m 7s | | trunk passed | | +1 :green_heart: | shadedclient | 33m 37s | | branch has no errors when building and testing our client artifacts. | | -0 :warning: | patch | 33m 58s | | Used diff version of patch file. Binary files and potentially other changes not applied. Please rebase and squash commits if necessary. 
| _ Patch Compile Tests _ | | +1 :green_heart: | mvninstall | 0m 29s | | the patch passed | | +1 :green_heart: | compile | 0m 30s | | the patch passed with JDK Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1 | | +1 :green_heart: | javac | 0m 30s | | the patch passed | | +1 :green_heart: | compile | 0m 27s | | the patch passed with JDK Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06 | | +1 :green_heart: | javac | 0m 27s | | the patch passed | | +1 :green_heart: | blanks | 0m 0s | | The patch has no blanks issues. | | -0 :warning: | checkstyle | 0m 21s | [/results-checkstyle-hadoop-tools_hadoop-azure.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6633/44/artifact/out/results-checkstyle-hadoop-tools_hadoop-azure.txt) | hadoop-tools/hadoop-azure: The patch generated 11 new + 18 unchanged - 0 fixed = 29 total (was 18) | | +1 :green_heart: | mvnsite | 0m 31s | | the patch passed | | +1 :green_heart: | javadoc | 0m 26s | | the patch passed with JDK Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1 | | +1 :green_heart: | javadoc | 0m 25s | | the patch passed with JDK Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06 | | +1 :green_heart: | spotbugs | 1m 6s | | the patch passed | | +1 :green_heart: | shadedclient | 33m 27s | | patch has no errors when building and testing our client artifacts. | _ Other Tests _ | | +1 :green_heart: | unit | 2m 32s | | hadoop-azure in the patch passed. | | +1 :green_heart: | asflicense | 0m 38s | | The patch does not generate ASF License warnings. 
| | | | 128m 54s | | | | Subsystem | Report/Notes | |--:|:-| | Docker | ClientAPI=1.45 ServerAPI=1.45 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6633/44/artifact/out/Dockerfile | | GITHUB PR | https://github.com/apache/hadoop/pull/6633 | | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets | | uname | Linux ffd477d5591e 5.15.0-94-generic #104-Ubuntu SMP Tue Jan 9 15:25:40 UTC 2024 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | dev-support/bin/hadoop.sh | | git revision | trunk / 6ed1dc3a4832a00f0c77a2cbe7b7338fa450007f | | Default Java | Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06 | | Multi-JDK versions | /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06 | | Test Results | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6633/44/testReport/ |
[jira] [Commented] (HADOOP-19120) [ABFS]: ApacheHttpClient adaptation as network library
[ https://issues.apache.org/jira/browse/HADOOP-19120?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17833458#comment-17833458 ] ASF GitHub Bot commented on HADOOP-19120: - saxenapranav commented on code in PR #6633: URL: https://github.com/apache/hadoop/pull/6633#discussion_r1549304525 ## hadoop-tools/hadoop-azure/src/main/java/org/apache/hadoop/fs/azurebfs/services/kac/KeepAliveCache.java: ## @@ -0,0 +1,345 @@ +/** + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.hadoop.fs.azurebfs.services.kac; Review Comment: Removed the kac package.
[jira] [Commented] (HADOOP-19120) [ABFS]: ApacheHttpClient adaptation as network library
[ https://issues.apache.org/jira/browse/HADOOP-19120?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17833457#comment-17833457 ] ASF GitHub Bot commented on HADOOP-19120: - hadoop-yetus commented on PR #6633: URL: https://github.com/apache/hadoop/pull/6633#issuecomment-2033984569 :confetti_ball: **+1 overall** | Vote | Subsystem | Runtime | Logfile | Comment | |::|--:|:|::|:---:| | +0 :ok: | reexec | 0m 33s | | Docker mode activated. | _ Prechecks _ | | +1 :green_heart: | dupname | 0m 1s | | No case conflicting files found. | | +0 :ok: | codespell | 0m 0s | | codespell was not available. | | +0 :ok: | detsecrets | 0m 0s | | detect-secrets was not available. | | +1 :green_heart: | @author | 0m 0s | | The patch does not contain any @author tags. | | +1 :green_heart: | test4tests | 0m 0s | | The patch appears to include 21 new or modified test files. | _ trunk Compile Tests _ | | +1 :green_heart: | mvninstall | 46m 20s | | trunk passed | | +1 :green_heart: | compile | 0m 38s | | trunk passed with JDK Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1 | | +1 :green_heart: | compile | 0m 37s | | trunk passed with JDK Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06 | | +1 :green_heart: | checkstyle | 0m 34s | | trunk passed | | +1 :green_heart: | mvnsite | 0m 41s | | trunk passed | | +1 :green_heart: | javadoc | 0m 40s | | trunk passed with JDK Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1 | | +1 :green_heart: | javadoc | 0m 36s | | trunk passed with JDK Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06 | | +1 :green_heart: | spotbugs | 1m 7s | | trunk passed | | +1 :green_heart: | shadedclient | 33m 48s | | branch has no errors when building and testing our client artifacts. | | -0 :warning: | patch | 34m 9s | | Used diff version of patch file. Binary files and potentially other changes not applied. Please rebase and squash commits if necessary. 
| _ Patch Compile Tests _ | | +1 :green_heart: | mvninstall | 0m 30s | | the patch passed | | +1 :green_heart: | compile | 0m 30s | | the patch passed with JDK Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1 | | +1 :green_heart: | javac | 0m 30s | | the patch passed | | +1 :green_heart: | compile | 0m 28s | | the patch passed with JDK Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06 | | +1 :green_heart: | javac | 0m 28s | | the patch passed | | +1 :green_heart: | blanks | 0m 0s | | The patch has no blanks issues. | | -0 :warning: | checkstyle | 0m 20s | [/results-checkstyle-hadoop-tools_hadoop-azure.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6633/43/artifact/out/results-checkstyle-hadoop-tools_hadoop-azure.txt) | hadoop-tools/hadoop-azure: The patch generated 11 new + 18 unchanged - 0 fixed = 29 total (was 18) | | +1 :green_heart: | mvnsite | 0m 31s | | the patch passed | | +1 :green_heart: | javadoc | 0m 27s | | the patch passed with JDK Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1 | | +1 :green_heart: | javadoc | 0m 26s | | the patch passed with JDK Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06 | | +1 :green_heart: | spotbugs | 1m 5s | | the patch passed | | +1 :green_heart: | shadedclient | 33m 36s | | patch has no errors when building and testing our client artifacts. | _ Other Tests _ | | +1 :green_heart: | unit | 2m 26s | | hadoop-azure in the patch passed. | | +1 :green_heart: | asflicense | 0m 37s | | The patch does not generate ASF License warnings. 
| | | | 131m 21s | | | | Subsystem | Report/Notes | |--:|:-| | Docker | ClientAPI=1.45 ServerAPI=1.45 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6633/43/artifact/out/Dockerfile | | GITHUB PR | https://github.com/apache/hadoop/pull/6633 | | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets | | uname | Linux 6641b7a5fc76 5.15.0-94-generic #104-Ubuntu SMP Tue Jan 9 15:25:40 UTC 2024 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | dev-support/bin/hadoop.sh | | git revision | trunk / 6cd01b500b900ed7aec236fd9a8091f35f76e810 | | Default Java | Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06 | | Multi-JDK versions | /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06 | | Test Results | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6633/43/testReport/ |
[jira] [Commented] (HADOOP-19120) [ABFS]: ApacheHttpClient adaptation as network library
[ https://issues.apache.org/jira/browse/HADOOP-19120?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17833452#comment-17833452 ] ASF GitHub Bot commented on HADOOP-19120: - saxenapranav commented on code in PR #6633: URL: https://github.com/apache/hadoop/pull/6633#discussion_r1549232373 ## hadoop-tools/hadoop-azure/src/main/java/org/apache/hadoop/fs/azurebfs/AbfsConfiguration.java: ## @@ -363,6 +364,10 @@ public class AbfsConfiguration{ FS_AZURE_ABFS_ENABLE_CHECKSUM_VALIDATION, DefaultValue = DEFAULT_ENABLE_ABFS_CHECKSUM_VALIDATION) private boolean isChecksumValidationEnabled; + @IntegerConfigurationValidatorAnnotation(ConfigurationKey = + FS_AZURE_APACHE_HTTP_CLIENT_MAX_IO_EXCEPTION_RETRIES, DefaultValue = DEFAULT_APACHE_HTTP_CLIENT_MAX_IO_EXCEPTION_RETRIES) Review Comment: Three retries would happen relatively quickly (in the case of exponential retry), and that is enough attempts for a genuine issue to get resolved.
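To make the "three retries happen relatively quickly" argument concrete, here is a small worked sketch of how much waiting three exponentially spaced retries add. The helper `BackoffEstimate`, the plain doubling rule, and the 500 ms base delay are assumptions for illustration only; they are not taken from the PR or from the ABFS retry policy, which may use different intervals and randomisation.

```java
/** Hypothetical helper; names and numbers are illustrative only. */
final class BackoffEstimate {
  private BackoffEstimate() {
  }

  /** Delay before retry number {@code retry} (1-based), doubling each time. */
  static long delayMillis(int retry, long baseMillis) {
    return baseMillis << (retry - 1);
  }

  /** Total time spent waiting across {@code retries} retries. */
  static long totalWaitMillis(int retries, long baseMillis) {
    long total = 0;
    for (int retry = 1; retry <= retries; retry++) {
      total += delayMillis(retry, baseMillis);
    }
    return total;
  }
}
```

Under these assumed numbers, `BackoffEstimate.totalWaitMillis(3, 500)` adds 500 + 1000 + 2000 ms of waiting in total, so a low retry count keeps the worst-case delay before giving up small.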
[jira] [Commented] (HADOOP-19120) [ABFS]: ApacheHttpClient adaptation as network library
[ https://issues.apache.org/jira/browse/HADOOP-19120?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17833451#comment-17833451 ] ASF GitHub Bot commented on HADOOP-19120: - saxenapranav commented on code in PR #6633: URL: https://github.com/apache/hadoop/pull/6633#discussion_r1549231346 ## hadoop-tools/hadoop-azure/src/main/java/org/apache/hadoop/fs/azurebfs/AbfsConfiguration.java: ## @@ -842,6 +847,17 @@ public DelegatingSSLSocketFactory.SSLChannelMode getPreferredSSLFactoryOption() return getEnum(FS_AZURE_SSL_CHANNEL_MODE_KEY, DEFAULT_FS_AZURE_SSL_CHANNEL_MODE); } + /** + * @return Config to select netlib for server communication. + */ + public HttpOperationType getPreferredHttpOperationType() { +return getEnum(FS_AZURE_NETWORKING_LIBRARY, DEFAULT_NETWORKING_LIBRARY); Review Comment: Since it is the more desirable client, Apache shall be the default. However, there is a fallback mechanism in place that makes the process fall back to JDK if the new library is facing some issue. The current mechanism is: if any request on abfsRestOperation fails more than 3 times (configurable), the whole process falls back to JDK.
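The fallback rule described in the review comment above (one request failing more than a configurable number of times switches the whole process to the JDK library) might look roughly like the following. This is a sketch under assumptions, not the code from the PR: `NetLib`, `NetLibFallback`, and their methods are hypothetical names, and the real change keeps this logic elsewhere.

```java
import java.util.concurrent.atomic.AtomicBoolean;

/** Hypothetical stand-in for the configured network library choice. */
enum NetLib { JDK, APACHE }

/** Illustrative process-wide fallback switch; not the class from the PR. */
final class NetLibFallback {
  private final int maxIoExceptionRetries;
  // Once set to false it never flips back in this sketch.
  private final AtomicBoolean apacheUsable = new AtomicBoolean(true);

  NetLibFallback(int maxIoExceptionRetries) {
    this.maxIoExceptionRetries = maxIoExceptionRetries;
  }

  /** Library to use for the next request, given the configured preference. */
  NetLib select(NetLib preferred) {
    if (preferred == NetLib.APACHE && !apacheUsable.get()) {
      return NetLib.JDK;
    }
    return preferred;
  }

  /** Reports how many times a single request failed on the Apache client. */
  void recordApacheFailures(int failuresForOneRequest) {
    if (failuresForOneRequest > maxIoExceptionRetries) {
      apacheUsable.set(false);
    }
  }
}
```

With a threshold of 3, a request that fails exactly 3 times leaves the Apache client selected, while a request that fails a fourth time makes every later `select(NetLib.APACHE)` return `NetLib.JDK`.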
[jira] [Commented] (HADOOP-19120) [ABFS]: ApacheHttpClient adaptation as network library
[ https://issues.apache.org/jira/browse/HADOOP-19120?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17833417#comment-17833417 ] ASF GitHub Bot commented on HADOOP-19120: - hadoop-yetus commented on PR #6633: URL: https://github.com/apache/hadoop/pull/6633#issuecomment-2033747862 :confetti_ball: **+1 overall** | Vote | Subsystem | Runtime | Logfile | Comment | |::|--:|:|::|:---:| | +0 :ok: | reexec | 0m 34s | | Docker mode activated. | _ Prechecks _ | | +1 :green_heart: | dupname | 0m 1s | | No case conflicting files found. | | +0 :ok: | codespell | 0m 0s | | codespell was not available. | | +0 :ok: | detsecrets | 0m 0s | | detect-secrets was not available. | | +1 :green_heart: | @author | 0m 0s | | The patch does not contain any @author tags. | | +1 :green_heart: | test4tests | 0m 0s | | The patch appears to include 21 new or modified test files. | _ trunk Compile Tests _ | | +1 :green_heart: | mvninstall | 44m 3s | | trunk passed | | +1 :green_heart: | compile | 0m 37s | | trunk passed with JDK Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1 | | +1 :green_heart: | compile | 0m 37s | | trunk passed with JDK Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06 | | +1 :green_heart: | checkstyle | 0m 33s | | trunk passed | | +1 :green_heart: | mvnsite | 0m 41s | | trunk passed | | +1 :green_heart: | javadoc | 0m 38s | | trunk passed with JDK Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1 | | +1 :green_heart: | javadoc | 0m 34s | | trunk passed with JDK Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06 | | +1 :green_heart: | spotbugs | 1m 5s | | trunk passed | | +1 :green_heart: | shadedclient | 33m 42s | | branch has no errors when building and testing our client artifacts. | | -0 :warning: | patch | 34m 3s | | Used diff version of patch file. Binary files and potentially other changes not applied. Please rebase and squash commits if necessary. 
| _ Patch Compile Tests _ | | +1 :green_heart: | mvninstall | 0m 29s | | the patch passed | | +1 :green_heart: | compile | 0m 29s | | the patch passed with JDK Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1 | | +1 :green_heart: | javac | 0m 29s | | the patch passed | | +1 :green_heart: | compile | 0m 27s | | the patch passed with JDK Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06 | | +1 :green_heart: | javac | 0m 27s | | the patch passed | | +1 :green_heart: | blanks | 0m 0s | | The patch has no blanks issues. | | -0 :warning: | checkstyle | 0m 20s | [/results-checkstyle-hadoop-tools_hadoop-azure.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6633/42/artifact/out/results-checkstyle-hadoop-tools_hadoop-azure.txt) | hadoop-tools/hadoop-azure: The patch generated 11 new + 18 unchanged - 0 fixed = 29 total (was 18) | | +1 :green_heart: | mvnsite | 0m 30s | | the patch passed | | +1 :green_heart: | javadoc | 0m 26s | | the patch passed with JDK Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1 | | +1 :green_heart: | javadoc | 0m 26s | | the patch passed with JDK Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06 | | +1 :green_heart: | spotbugs | 1m 5s | | the patch passed | | +1 :green_heart: | shadedclient | 34m 24s | | patch has no errors when building and testing our client artifacts. | _ Other Tests _ | | +1 :green_heart: | unit | 2m 22s | | hadoop-azure in the patch passed. | | +1 :green_heart: | asflicense | 0m 33s | | The patch does not generate ASF License warnings. 
| | | | 129m 20s | | | | Subsystem | Report/Notes | |--:|:-| | Docker | ClientAPI=1.45 ServerAPI=1.45 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6633/42/artifact/out/Dockerfile | | GITHUB PR | https://github.com/apache/hadoop/pull/6633 | | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets | | uname | Linux 9d601761aec1 5.15.0-94-generic #104-Ubuntu SMP Tue Jan 9 15:25:40 UTC 2024 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | dev-support/bin/hadoop.sh | | git revision | trunk / ab9237dff26350968b66c9067073a47089949cc1 | | Default Java | Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06 | | Multi-JDK versions | /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06 | | Test Results | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6633/42/testReport/ |
[jira] [Commented] (HADOOP-19120) [ABFS]: ApacheHttpClient adaptation as network library
[ https://issues.apache.org/jira/browse/HADOOP-19120?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17833413#comment-17833413 ] ASF GitHub Bot commented on HADOOP-19120: - saxenapranav commented on code in PR #6633: URL: https://github.com/apache/hadoop/pull/6633#discussion_r1549007840 ## hadoop-tools/hadoop-azure/src/main/java/org/apache/hadoop/fs/azurebfs/services/AbfsAHCHttpOperation.java: ## @@ -0,0 +1,423 @@ +/** + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */ + +package org.apache.hadoop.fs.azurebfs.services; + +import java.io.IOException; +import java.io.InputStream; +import java.net.URI; +import java.net.URISyntaxException; +import java.net.URL; +import java.util.ArrayList; +import java.util.Collections; +import java.util.HashMap; +import java.util.List; +import java.util.Map; + +import org.slf4j.Logger; +import org.slf4j.LoggerFactory; + +import org.apache.hadoop.classification.VisibleForTesting; +import org.apache.hadoop.fs.PathIOException; +import org.apache.hadoop.fs.azurebfs.AbfsConfiguration; +import org.apache.hadoop.fs.azurebfs.constants.AbfsHttpConstants; +import org.apache.hadoop.fs.azurebfs.constants.HttpHeaderConfigurations; +import org.apache.hadoop.fs.azurebfs.contracts.exceptions.AbfsApacheHttpExpect100Exception; +import org.apache.hadoop.security.ssl.DelegatingSSLSocketFactory; +import org.apache.http.Header; +import org.apache.http.HttpEntity; +import org.apache.http.HttpResponse; +import org.apache.http.client.methods.CloseableHttpResponse; +import org.apache.http.client.methods.HttpDelete; +import org.apache.http.client.methods.HttpEntityEnclosingRequestBase; +import org.apache.http.client.methods.HttpGet; +import org.apache.http.client.methods.HttpHead; +import org.apache.http.client.methods.HttpPatch; +import org.apache.http.client.methods.HttpPost; +import org.apache.http.client.methods.HttpPut; +import org.apache.http.client.methods.HttpRequestBase; +import org.apache.http.entity.ByteArrayEntity; +import org.apache.http.util.EntityUtils; + +import static org.apache.hadoop.fs.azurebfs.constants.AbfsHttpConstants.APACHE_IMPL; +import static org.apache.hadoop.fs.azurebfs.constants.AbfsHttpConstants.EMPTY_STRING; +import static org.apache.hadoop.fs.azurebfs.constants.AbfsHttpConstants.HTTP_METHOD_DELETE; +import static org.apache.hadoop.fs.azurebfs.constants.AbfsHttpConstants.HTTP_METHOD_GET; +import static org.apache.hadoop.fs.azurebfs.constants.AbfsHttpConstants.HTTP_METHOD_HEAD; +import 
static org.apache.hadoop.fs.azurebfs.constants.AbfsHttpConstants.HTTP_METHOD_PATCH; +import static org.apache.hadoop.fs.azurebfs.constants.AbfsHttpConstants.HTTP_METHOD_POST; +import static org.apache.hadoop.fs.azurebfs.constants.AbfsHttpConstants.HTTP_METHOD_PUT; +import static org.apache.hadoop.fs.azurebfs.constants.HttpHeaderConfigurations.X_MS_CLIENT_REQUEST_ID; +import static org.apache.http.entity.ContentType.TEXT_PLAIN; + +/** + * Implementation of {@link HttpOperation} for orchestrating server calls using + * Apache Http Client. + */ +public class AbfsAHCHttpOperation extends HttpOperation { + + private static final Logger LOG = LoggerFactory.getLogger( + AbfsAHCHttpOperation.class); + + private static volatile AbfsApacheHttpClient ABFS_APACHE_HTTP_CLIENT; + + private HttpRequestBase httpRequestBase; + + private HttpResponse httpResponse; + + private AbfsManagedHttpContext abfsHttpClientContext; + + private final AbfsRestOperationType abfsRestOperationType; + + private boolean connectionDisconnectedOnError = false; + + private final boolean isPayloadRequest; + + private List requestHeaders; + + private AbfsAHCHttpOperation(final URL url, Review Comment: Taken. The read and connect timeouts are now passed in the constructor.
[jira] [Commented] (HADOOP-19120) [ABFS]: ApacheHttpClient adaptation as network library
[ https://issues.apache.org/jira/browse/HADOOP-19120?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17833377#comment-17833377 ]

ASF GitHub Bot commented on HADOOP-19120:
-----------------------------------------

saxenapranav commented on code in PR #6633:
URL: https://github.com/apache/hadoop/pull/6633#discussion_r1548855987

## hadoop-tools/hadoop-azure/src/main/java/org/apache/hadoop/fs/azurebfs/services/HttpOperation.java: ##
@@ -0,0 +1,510 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements. See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership. The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.hadoop.fs.azurebfs.services;
+
+import java.io.IOException;
+import java.io.InputStream;
+import java.net.HttpURLConnection;
+import java.net.URL;
+import java.util.List;
+import java.util.Map;
+
+import com.fasterxml.jackson.core.JsonFactory;
+import com.fasterxml.jackson.core.JsonParser;
+import com.fasterxml.jackson.core.JsonToken;
+import com.fasterxml.jackson.databind.ObjectMapper;
+import org.slf4j.Logger;
+
+import org.apache.hadoop.fs.azurebfs.constants.AbfsHttpConstants;
+import org.apache.hadoop.fs.azurebfs.constants.HttpHeaderConfigurations;
+import org.apache.hadoop.fs.azurebfs.contracts.services.AbfsPerfLoggable;
+import org.apache.hadoop.fs.azurebfs.contracts.services.ListResultSchema;
+import org.apache.hadoop.fs.azurebfs.utils.UriUtils;
+
+/**
+ * Base Http operation class for orchestrating server IO calls. Child classes would
+ * define the certain orchestration implementation on the basis of network library used.
+ *
+ * For JDK netlib usage, the child class would be {@link AbfsHttpOperation}.
+ * For ApacheHttpClient netlib usage, the child class would be {@link AbfsAHCHttpOperation}.
+ *
+ */
+public abstract class HttpOperation implements AbfsPerfLoggable {
+
+  private final Logger log;
+
+  private static final int CLEAN_UP_BUFFER_SIZE = 64 * 1024;
+
+  private static final int ONE_THOUSAND = 1000;
+
+  private static final int ONE_MILLION = ONE_THOUSAND * ONE_THOUSAND;
+
+  private String method;
+
+  private URL url;
+
+  private String maskedUrl;
+
+  private String maskedEncodedUrl;
+
+  private int statusCode;
+
+  private String statusDescription;
+
+  private String storageErrorCode = "";
+
+  private String storageErrorMessage = "";
+
+  private String requestId = "";
+
+  private String expectedAppendPos = "";
+
+  private ListResultSchema listResultSchema = null;
+
+  // metrics
+  private int bytesSent;
+
+  private int expectedBytesToBeSent;
+
+  private long bytesReceived;
+
+  private long connectionTimeMs;
+
+  private long sendRequestTimeMs;
+
+  private long recvResponseTimeMs;
+
+  private boolean shouldMask = false;
+
+  public HttpOperation(Logger logger,
+      final URL url,
+      final String method,
+      final int httpStatus) {
+    this.log = logger;
+    this.url = url;
+    this.method = method;
+    this.statusCode = httpStatus;
+  }
+
+  public HttpOperation(final Logger log, final URL url, final String method) {
+    this.log = log;
+    this.url = url;
+    this.method = method;
+  }
+
+  public String getMethod() {
+    return method;
+  }
+
+  public String getHost() {
+    return url.getHost();
+  }
+
+  public int getStatusCode() {
+    return statusCode;
+  }
+
+  public String getStatusDescription() {
+    return statusDescription;
+  }
+
+  public String getStorageErrorCode() {
+    return storageErrorCode;
+  }
+
+  public String getStorageErrorMessage() {
+    return storageErrorMessage;
+  }
+
+  public abstract String getClientRequestId();
+
+  public String getExpectedAppendPos() {
+    return expectedAppendPos;
+  }
+
+  public String getRequestId() {
+    return requestId;
+  }
+
+  public void setMaskForSAS() {
+    shouldMask = true;
+  }
+
+  public int getBytesSent() {
+    return bytesSent;
+  }
+
+  public int getExpectedBytesToBeSent() {
+    return expectedBytesToBeSent;
+  }
+
+  public long getBytesReceived() {
+    return bytesReceived;
+  }
+
+  public URL getUrl() {
+    return url;
+  }
+
+  public ListResultSchema getListResultSchema() {
+    return listResultSchema;
+  }
+
+  public abstract String getResponseHeader(String httpHeader);
+
+  void setExpectedBytesToBeSent(int e
[jira] [Commented] (HADOOP-19120) [ABFS]: ApacheHttpClient adaptation as network library
[ https://issues.apache.org/jira/browse/HADOOP-19120?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17833378#comment-17833378 ]

ASF GitHub Bot commented on HADOOP-19120:
-----------------------------------------

saxenapranav commented on code in PR #6633:
URL: https://github.com/apache/hadoop/pull/6633#discussion_r1548855671

## hadoop-tools/hadoop-azure/src/main/java/org/apache/hadoop/fs/azurebfs/services/HttpOperation.java: ##
@@ -0,0 +1,510 @@
[jira] [Commented] (HADOOP-19120) [ABFS]: ApacheHttpClient adaptation as network library
[ https://issues.apache.org/jira/browse/HADOOP-19120?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17833372#comment-17833372 ]

ASF GitHub Bot commented on HADOOP-19120:
-----------------------------------------

saxenapranav commented on code in PR #6633:
URL: https://github.com/apache/hadoop/pull/6633#discussion_r1548854300

## hadoop-tools/hadoop-azure/src/main/java/org/apache/hadoop/fs/azurebfs/services/AbfsApacheHttpClient.java: ##
@@ -0,0 +1,93 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements. See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership. The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.hadoop.fs.azurebfs.services;
+
+import java.io.IOException;
+
+import org.apache.hadoop.fs.azurebfs.AbfsConfiguration;
+import org.apache.hadoop.security.ssl.DelegatingSSLSocketFactory;
+import org.apache.http.HttpResponse;
+import org.apache.http.client.config.RequestConfig;
+import org.apache.http.client.methods.HttpRequestBase;
+import org.apache.http.config.Registry;
+import org.apache.http.config.RegistryBuilder;
+import org.apache.http.conn.socket.ConnectionSocketFactory;
+import org.apache.http.conn.socket.PlainConnectionSocketFactory;
+import org.apache.http.conn.ssl.SSLConnectionSocketFactory;
+import org.apache.http.impl.client.CloseableHttpClient;
+import org.apache.http.impl.client.HttpClientBuilder;
+import org.apache.http.impl.client.HttpClients;
+
+import static org.apache.http.conn.ssl.SSLConnectionSocketFactory.getDefaultHostnameVerifier;
+
+public class AbfsApacheHttpClient {
+  private final CloseableHttpClient httpClient;
+
+  private final AbfsConfiguration abfsConfiguration;
+
+  public AbfsApacheHttpClient(DelegatingSSLSocketFactory delegatingSSLSocketFactory,
+      final AbfsConfiguration abfsConfiguration) {
+    this.abfsConfiguration = abfsConfiguration;
+    final AbfsConnectionManager connMgr = new AbfsConnectionManager(
+        createSocketFactoryRegistry(
+            new SSLConnectionSocketFactory(delegatingSSLSocketFactory,
+                getDefaultHostnameVerifier())),
+        new org.apache.hadoop.fs.azurebfs.services.AbfsConnFactory());
+    final HttpClientBuilder builder = HttpClients.custom();
+    builder.setConnectionManager(connMgr)
+        .setRequestExecutor(new AbfsManagedHttpRequestExecutor(
+            abfsConfiguration.getHttpReadTimeout()))
+        .disableContentCompression()
+        .disableRedirectHandling()
+        .disableAutomaticRetries()
+        .setUserAgent(
+            ""); // SDK will set the user agent header in the pipeline. Don't let Apache waste time
+    httpClient = builder.build();
+  }
+
+  public void close() throws IOException {
+    if (httpClient != null) {
+      httpClient.close();
+    }
+  }
+
+  public HttpResponse execute(HttpRequestBase httpRequest,
+      final AbfsManagedHttpContext abfsHttpClientContext) throws IOException {
+    RequestConfig.Builder requestConfigBuilder = RequestConfig
+        .custom()
+        .setConnectTimeout(abfsConfiguration.getHttpConnectionTimeout())
+        .setSocketTimeout(abfsConfiguration.getHttpReadTimeout());
+    httpRequest.setConfig(requestConfigBuilder.build());
+    return httpClient.execute(httpRequest, abfsHttpClientContext);
+  }
+
+
+  private static Registry createSocketFactoryRegistry(
+      ConnectionSocketFactory sslSocketFactory) {
+    if (sslSocketFactory == null) {
+      return RegistryBuilder.create()
+          .register("http", PlainConnectionSocketFactory.getSocketFactory())

Review Comment:
taken.
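For comparison with the `RequestConfig` call in `execute` above, here is a minimal sketch of how the same two limits are applied on the JDK-netlib side with `HttpURLConnection`. The class name `JdkTimeoutSketch`, the URL, and the timeout values are illustrative assumptions, not code from the PR. `setReadTimeout` corresponds only roughly to Apache's `setSocketTimeout`.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.HttpURLConnection;
import java.net.URI;

/** Illustrative only: not part of the PR. */
final class JdkTimeoutSketch {

  /** Creates an HttpURLConnection (no network I/O yet) with both timeouts set. */
  static HttpURLConnection open(String url, int connectTimeoutMs, int readTimeoutMs) {
    try {
      HttpURLConnection conn =
          (HttpURLConnection) URI.create(url).toURL().openConnection();
      // Upper bound on establishing the connection.
      conn.setConnectTimeout(connectTimeoutMs);
      // Upper bound on waiting for data once connected.
      conn.setReadTimeout(readTimeoutMs);
      return conn;
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  public static void main(String[] args) {
    // Hypothetical endpoint and values; nothing is sent until connect() is called.
    HttpURLConnection conn = open("http://localhost/", 2_000, 30_000);
    System.out.println(conn.getConnectTimeout() + " ms connect, "
        + conn.getReadTimeout() + " ms read");
  }
}
```

Either way the timeouts are per-request settings, which is why the PR can take them from `AbfsConfiguration` each time a request is executed.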
[jira] [Commented] (HADOOP-19120) [ABFS]: ApacheHttpClient adaptation as network library
[ https://issues.apache.org/jira/browse/HADOOP-19120?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17833374#comment-17833374 ]

ASF GitHub Bot commented on HADOOP-19120:
-----------------------------------------

saxenapranav commented on code in PR #6633:
URL: https://github.com/apache/hadoop/pull/6633#discussion_r1548854470

## hadoop-tools/hadoop-azure/src/main/java/org/apache/hadoop/fs/azurebfs/services/AbfsClient.java: ##
@@ -1512,6 +1515,7 @@ String initializeUserAgent(final AbfsConfiguration abfsConfiguration,
       sb.append(HUNDRED_CONTINUE);
       sb.append(SEMICOLON);
     }
+    sb.append(" ").append(abfsConfiguration.getPreferredHttpOperationType()).append(";");

Review Comment:
taken.

> [ABFS]: ApacheHttpClient adaptation as network library
> ------------------------------------------------------
>
> Key: HADOOP-19120
> URL: https://issues.apache.org/jira/browse/HADOOP-19120
> Project: Hadoop Common
> Issue Type: Sub-task
> Components: fs/azure
> Reporter: Pranav Saxena
> Assignee: Pranav Saxena
> Priority: Major
> Labels: pull-request-available
>
> Apache HttpClient is more feature-rich and flexible and gives application
> more granular control over networking parameter.
> ABFS currently relies on the JDK-net library. This library is managed by
> OpenJDK and has no performance problem. However, it limits the application's
> control over networking, and there are very few APIs and hooks exposed that
> the application can use to get metrics, choose which and when a connection
> should be reused. ApacheHttpClient will give important hooks to fetch
> important metrics and control networking parameters.
> A custom implementation of connection-pool is used. The implementation is
> adapted from the JDK8 connection pooling. Reasons for doing it:
> 1. PoolingHttpClientConnectionManager heuristic caches all the reusable
> connections it has created. JDK's implementation only caches limited number
> of connections. The limit is given by JVM system property
> "http.maxConnections". If there is no system-property, it defaults to 5.
> Connection-establishment latency increased with all the connections were
> cached. Hence, adapting the pooling heuristic of JDK netlib,
> 2. In PoolingHttpClientConnectionManager, it expects the application to
> provide `setMaxPerRoute` and `setMaxTotal`, which the implementation uses as
> the total number of connections it can create. For application using ABFS, it
> is not feasible to provide a value in the initialisation of the
> connectionManager. JDK's implementation has no cap on the number of
> connections it can have opened on a moment. Hence, adapting the pooling
> heuristic of JDK netlib,

--
This message was sent by Atlassian Jira
(v8.20.10#820010)

---------------------------------------------------------------------
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org
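The pooling heuristic described in the two points above (a small bounded cache of idle connections, no cap on how many may be open at once) can be sketched as follows. This is an illustration under stated assumptions, not the PR's `AbfsConnectionManager`: the class `BoundedKeepAliveCache`, its method names, and the LIFO reuse order are all hypothetical. Only the property name `http.maxConnections` and the default of 5 come from the issue description.

```java
import java.util.ArrayDeque;
import java.util.Deque;

/** Illustrative only: a bounded cache of idle, reusable connections. */
final class BoundedKeepAliveCache<C> {
  /** JVM system property named in the issue description. */
  static final String MAX_CONN_PROPERTY = "http.maxConnections";
  /** Default stated in the issue description when the property is unset. */
  static final int DEFAULT_MAX_CONN = 5;

  private final int maxIdle;
  private final Deque<C> idle = new ArrayDeque<>();

  BoundedKeepAliveCache(int maxIdle) {
    if (maxIdle < 0) {
      throw new IllegalArgumentException("maxIdle must be >= 0");
    }
    this.maxIdle = maxIdle;
  }

  /** Sizes the cache from the system property, falling back to the default. */
  static <C> BoundedKeepAliveCache<C> fromSystemProperty() {
    return new BoundedKeepAliveCache<>(
        Integer.getInteger(MAX_CONN_PROPERTY, DEFAULT_MAX_CONN));
  }

  /**
   * Returns the most recently cached connection, or null when the cache is
   * empty. On null the caller opens a new connection; nothing here limits
   * how many connections are open at the same time.
   */
  synchronized C tryGet() {
    return idle.pollFirst();
  }

  /**
   * Offers a finished connection for reuse. Returns false when the cache is
   * already full, in which case the caller should close the connection.
   */
  synchronized boolean tryPut(C conn) {
    if (idle.size() >= maxIdle) {
      return false;
    }
    idle.addFirst(conn);
    return true;
  }

  synchronized int size() {
    return idle.size();
  }

  public static void main(String[] args) {
    BoundedKeepAliveCache<String> cache = new BoundedKeepAliveCache<>(2);
    cache.tryPut("conn-1");
    cache.tryPut("conn-2");
    System.out.println("third cached? " + cache.tryPut("conn-3"));
    System.out.println("reused: " + cache.tryGet());
  }
}
```

Bounding only the idle set keeps the number of cached sockets small while leaving the application free to open as many concurrent connections as its workload needs, which matches the reasoning given in the description.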
[jira] [Commented] (HADOOP-19120) [ABFS]: ApacheHttpClient adaptation as network library
[ https://issues.apache.org/jira/browse/HADOOP-19120?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17833371#comment-17833371 ]

ASF GitHub Bot commented on HADOOP-19120:
-----------------------------------------

saxenapranav commented on code in PR #6633:
URL: https://github.com/apache/hadoop/pull/6633#discussion_r1548854200

## hadoop-tools/hadoop-azure/src/main/java/org/apache/hadoop/fs/azurebfs/services/AbfsApacheHttpClient.java: ##
@@ -0,0 +1,93 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements. See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership. The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.hadoop.fs.azurebfs.services;
+
+import java.io.IOException;
+
+import org.apache.hadoop.fs.azurebfs.AbfsConfiguration;
+import org.apache.hadoop.security.ssl.DelegatingSSLSocketFactory;
+import org.apache.http.HttpResponse;
+import org.apache.http.client.config.RequestConfig;
+import org.apache.http.client.methods.HttpRequestBase;
+import org.apache.http.config.Registry;
+import org.apache.http.config.RegistryBuilder;
+import org.apache.http.conn.socket.ConnectionSocketFactory;
+import org.apache.http.conn.socket.PlainConnectionSocketFactory;
+import org.apache.http.conn.ssl.SSLConnectionSocketFactory;
+import org.apache.http.impl.client.CloseableHttpClient;
+import org.apache.http.impl.client.HttpClientBuilder;
+import org.apache.http.impl.client.HttpClients;
+
+import static org.apache.http.conn.ssl.SSLConnectionSocketFactory.getDefaultHostnameVerifier;
+
+public class AbfsApacheHttpClient {
+  private final CloseableHttpClient httpClient;
+
+  private final AbfsConfiguration abfsConfiguration;
+
+  public AbfsApacheHttpClient(DelegatingSSLSocketFactory delegatingSSLSocketFactory,
+      final AbfsConfiguration abfsConfiguration) {
+    this.abfsConfiguration = abfsConfiguration;
+    final AbfsConnectionManager connMgr = new AbfsConnectionManager(
+        createSocketFactoryRegistry(
+            new SSLConnectionSocketFactory(delegatingSSLSocketFactory,
+                getDefaultHostnameVerifier())),
+        new org.apache.hadoop.fs.azurebfs.services.AbfsConnFactory());
+    final HttpClientBuilder builder = HttpClients.custom();
+    builder.setConnectionManager(connMgr)
+        .setRequestExecutor(new AbfsManagedHttpRequestExecutor(
+            abfsConfiguration.getHttpReadTimeout()))
+        .disableContentCompression()
+        .disableRedirectHandling()
+        .disableAutomaticRetries()
+        .setUserAgent(
+            ""); // SDK will set the user agent header in the pipeline. Don't let Apache waste time

Review Comment:
Have fixed the comment.
[jira] [Commented] (HADOOP-19120) [ABFS]: ApacheHttpClient adaptation as network library
[ https://issues.apache.org/jira/browse/HADOOP-19120?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17833194#comment-17833194 ]

ASF GitHub Bot commented on HADOOP-19120:
-----------------------------------------

hadoop-yetus commented on PR #6633:
URL: https://github.com/apache/hadoop/pull/6633#issuecomment-2032001807

:confetti_ball: **+1 overall**

| Vote | Subsystem | Runtime | Logfile | Comment |
|:----:|----------:|--------:|:-------:|:-------:|
| +0 :ok: | reexec | 0m 58s | | Docker mode activated. |
|||| _ Prechecks _ |
| +1 :green_heart: | dupname | 0m 1s | | No case conflicting files found. |
| +0 :ok: | codespell | 0m 0s | | codespell was not available. |
| +0 :ok: | detsecrets | 0m 0s | | detect-secrets was not available. |
| +1 :green_heart: | @author | 0m 0s | | The patch does not contain any @author tags. |
| +1 :green_heart: | test4tests | 0m 0s | | The patch appears to include 21 new or modified test files. |
|||| _ trunk Compile Tests _ |
| +1 :green_heart: | mvninstall | 45m 21s | | trunk passed |
| +1 :green_heart: | compile | 0m 38s | | trunk passed with JDK Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1 |
| +1 :green_heart: | compile | 0m 36s | | trunk passed with JDK Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06 |
| +1 :green_heart: | checkstyle | 0m 33s | | trunk passed |
| +1 :green_heart: | mvnsite | 0m 41s | | trunk passed |
| +1 :green_heart: | javadoc | 0m 39s | | trunk passed with JDK Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1 |
| +1 :green_heart: | javadoc | 0m 35s | | trunk passed with JDK Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06 |
| +1 :green_heart: | spotbugs | 1m 7s | | trunk passed |
| +1 :green_heart: | shadedclient | 33m 27s | | branch has no errors when building and testing our client artifacts. |
| -0 :warning: | patch | 33m 48s | | Used diff version of patch file. Binary files and potentially other changes not applied. Please rebase and squash commits if necessary. |
|||| _ Patch Compile Tests _ |
| +1 :green_heart: | mvninstall | 0m 30s | | the patch passed |
| +1 :green_heart: | compile | 0m 30s | | the patch passed with JDK Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1 |
| +1 :green_heart: | javac | 0m 30s | | the patch passed |
| +1 :green_heart: | compile | 0m 27s | | the patch passed with JDK Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06 |
| +1 :green_heart: | javac | 0m 27s | | the patch passed |
| +1 :green_heart: | blanks | 0m 0s | | The patch has no blanks issues. |
| -0 :warning: | checkstyle | 0m 20s | [/results-checkstyle-hadoop-tools_hadoop-azure.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6633/41/artifact/out/results-checkstyle-hadoop-tools_hadoop-azure.txt) | hadoop-tools/hadoop-azure: The patch generated 9 new + 18 unchanged - 0 fixed = 27 total (was 18) |
| +1 :green_heart: | mvnsite | 0m 30s | | the patch passed |
| +1 :green_heart: | javadoc | 0m 26s | | the patch passed with JDK Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1 |
| +1 :green_heart: | javadoc | 0m 26s | | the patch passed with JDK Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06 |
| +1 :green_heart: | spotbugs | 1m 5s | | the patch passed |
| +1 :green_heart: | shadedclient | 33m 14s | | patch has no errors when building and testing our client artifacts. |
|||| _ Other Tests _ |
| +1 :green_heart: | unit | 2m 32s | | hadoop-azure in the patch passed. |
| +1 :green_heart: | asflicense | 0m 37s | | The patch does not generate ASF License warnings. |
| | | 129m 55s | | |

| Subsystem | Report/Notes |
|----------:|:-------------|
| Docker | ClientAPI=1.45 ServerAPI=1.45 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6633/41/artifact/out/Dockerfile |
| GITHUB PR | https://github.com/apache/hadoop/pull/6633 |
| Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets |
| uname | Linux f85836cea0f9 5.15.0-94-generic #104-Ubuntu SMP Tue Jan 9 15:25:40 UTC 2024 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | dev-support/bin/hadoop.sh |
| git revision | trunk / 241bce05039156d7e55b4c5c6b32acca4c9656ef |
| Default Java | Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06 |
| Multi-JDK versions | /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06 |
| Test Results | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6633/41/testReport/ |
[jira] [Commented] (HADOOP-19120) [ABFS]: ApacheHttpClient adaptation as network library
[ https://issues.apache.org/jira/browse/HADOOP-19120?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17833187#comment-17833187 ]

ASF GitHub Bot commented on HADOOP-19120:
-----------------------------------------

hadoop-yetus commented on PR #6633:
URL: https://github.com/apache/hadoop/pull/6633#issuecomment-2031978575

:confetti_ball: **+1 overall**

| Vote | Subsystem | Runtime | Logfile | Comment |
|:----:|----------:|--------:|:-------:|:-------:|
| +0 :ok: | reexec | 0m 31s | | Docker mode activated. |
|||| _ Prechecks _ |
| +1 :green_heart: | dupname | 0m 1s | | No case conflicting files found. |
| +0 :ok: | codespell | 0m 0s | | codespell was not available. |
| +0 :ok: | detsecrets | 0m 0s | | detect-secrets was not available. |
| +1 :green_heart: | @author | 0m 0s | | The patch does not contain any @author tags. |
| +1 :green_heart: | test4tests | 0m 0s | | The patch appears to include 21 new or modified test files. |
|||| _ trunk Compile Tests _ |
| +1 :green_heart: | mvninstall | 44m 25s | | trunk passed |
| +1 :green_heart: | compile | 0m 38s | | trunk passed with JDK Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1 |
| +1 :green_heart: | compile | 0m 36s | | trunk passed with JDK Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06 |
| +1 :green_heart: | checkstyle | 0m 34s | | trunk passed |
| +1 :green_heart: | mvnsite | 0m 41s | | trunk passed |
| +1 :green_heart: | javadoc | 0m 39s | | trunk passed with JDK Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1 |
| +1 :green_heart: | javadoc | 0m 37s | | trunk passed with JDK Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06 |
| +1 :green_heart: | spotbugs | 1m 5s | | trunk passed |
| +1 :green_heart: | shadedclient | 33m 55s | | branch has no errors when building and testing our client artifacts. |
| -0 :warning: | patch | 34m 17s | | Used diff version of patch file. Binary files and potentially other changes not applied. Please rebase and squash commits if necessary. |
|||| _ Patch Compile Tests _ |
| +1 :green_heart: | mvninstall | 0m 31s | | the patch passed |
| +1 :green_heart: | compile | 0m 31s | | the patch passed with JDK Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1 |
| +1 :green_heart: | javac | 0m 31s | | the patch passed |
| +1 :green_heart: | compile | 0m 28s | | the patch passed with JDK Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06 |
| +1 :green_heart: | javac | 0m 28s | | the patch passed |
| +1 :green_heart: | blanks | 0m 0s | | The patch has no blanks issues. |
| -0 :warning: | checkstyle | 0m 21s | [/results-checkstyle-hadoop-tools_hadoop-azure.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6633/40/artifact/out/results-checkstyle-hadoop-tools_hadoop-azure.txt) | hadoop-tools/hadoop-azure: The patch generated 9 new + 18 unchanged - 0 fixed = 27 total (was 18) |
| +1 :green_heart: | mvnsite | 0m 30s | | the patch passed |
| +1 :green_heart: | javadoc | 0m 27s | | the patch passed with JDK Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1 |
| +1 :green_heart: | javadoc | 0m 26s | | the patch passed with JDK Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06 |
| +1 :green_heart: | spotbugs | 1m 5s | | the patch passed |
| +1 :green_heart: | shadedclient | 34m 6s | | patch has no errors when building and testing our client artifacts. |
|||| _ Other Tests _ |
| +1 :green_heart: | unit | 2m 27s | | hadoop-azure in the patch passed. |
| +1 :green_heart: | asflicense | 0m 37s | | The patch does not generate ASF License warnings. |
| | | 129m 51s | | |

| Subsystem | Report/Notes |
|----------:|:-------------|
| Docker | ClientAPI=1.45 ServerAPI=1.45 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6633/40/artifact/out/Dockerfile |
| GITHUB PR | https://github.com/apache/hadoop/pull/6633 |
| Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets |
| uname | Linux 4a364f718a9b 5.15.0-94-generic #104-Ubuntu SMP Tue Jan 9 15:25:40 UTC 2024 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | dev-support/bin/hadoop.sh |
| git revision | trunk / 266caf2fd0d92a13de7301bdf361810cd5ea82ca |
| Default Java | Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06 |
| Multi-JDK versions | /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06 |
| Test Results | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6633/40/testReport/ |
[jira] [Commented] (HADOOP-19120) [ABFS]: ApacheHttpClient adaptation as network library
[ https://issues.apache.org/jira/browse/HADOOP-19120?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17833183#comment-17833183 ]

ASF GitHub Bot commented on HADOOP-19120:
-----------------------------------------

hadoop-yetus commented on PR #6633:
URL: https://github.com/apache/hadoop/pull/6633#issuecomment-2031960607

:confetti_ball: **+1 overall**

| Vote | Subsystem | Runtime | Logfile | Comment |
|:----:|----------:|--------:|:-------:|:-------:|
| +0 :ok: | reexec | 0m 34s | | Docker mode activated. |
|||| _ Prechecks _ |
| +1 :green_heart: | dupname | 0m 1s | | No case conflicting files found. |
| +0 :ok: | codespell | 0m 0s | | codespell was not available. |
| +0 :ok: | detsecrets | 0m 0s | | detect-secrets was not available. |
| +1 :green_heart: | @author | 0m 0s | | The patch does not contain any @author tags. |
| +1 :green_heart: | test4tests | 0m 0s | | The patch appears to include 22 new or modified test files. |
|||| _ trunk Compile Tests _ |
| +1 :green_heart: | mvninstall | 45m 10s | | trunk passed |
| +1 :green_heart: | compile | 0m 35s | | trunk passed with JDK Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1 |
| +1 :green_heart: | compile | 0m 32s | | trunk passed with JDK Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06 |
| +1 :green_heart: | checkstyle | 0m 30s | | trunk passed |
| +1 :green_heart: | mvnsite | 0m 39s | | trunk passed |
| +1 :green_heart: | javadoc | 0m 36s | | trunk passed with JDK Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1 |
| +1 :green_heart: | javadoc | 0m 32s | | trunk passed with JDK Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06 |
| +1 :green_heart: | spotbugs | 1m 1s | | trunk passed |
| +1 :green_heart: | shadedclient | 34m 49s | | branch has no errors when building and testing our client artifacts. |
| -0 :warning: | patch | 35m 10s | | Used diff version of patch file. Binary files and potentially other changes not applied. Please rebase and squash commits if necessary. |
|||| _ Patch Compile Tests _ |
| +1 :green_heart: | mvninstall | 0m 30s | | the patch passed |
| +1 :green_heart: | compile | 0m 30s | | the patch passed with JDK Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1 |
| +1 :green_heart: | javac | 0m 30s | | the patch passed |
| +1 :green_heart: | compile | 0m 27s | | the patch passed with JDK Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06 |
| +1 :green_heart: | javac | 0m 27s | | the patch passed |
| +1 :green_heart: | blanks | 0m 0s | | The patch has no blanks issues. |
| -0 :warning: | checkstyle | 0m 20s | [/results-checkstyle-hadoop-tools_hadoop-azure.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6633/39/artifact/out/results-checkstyle-hadoop-tools_hadoop-azure.txt) | hadoop-tools/hadoop-azure: The patch generated 9 new + 18 unchanged - 0 fixed = 27 total (was 18) |
| +1 :green_heart: | mvnsite | 0m 30s | | the patch passed |
| +1 :green_heart: | javadoc | 0m 27s | | the patch passed with JDK Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1 |
| +1 :green_heart: | javadoc | 0m 26s | | the patch passed with JDK Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06 |
| +1 :green_heart: | spotbugs | 1m 6s | | the patch passed |
| +1 :green_heart: | shadedclient | 33m 26s | | patch has no errors when building and testing our client artifacts. |
|||| _ Other Tests _ |
| +1 :green_heart: | unit | 2m 26s | | hadoop-azure in the patch passed. |
| +1 :green_heart: | asflicense | 0m 37s | | The patch does not generate ASF License warnings. |
| | | 130m 47s | | |

| Subsystem | Report/Notes |
|----------:|:-------------|
| Docker | ClientAPI=1.45 ServerAPI=1.45 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6633/39/artifact/out/Dockerfile |
| GITHUB PR | https://github.com/apache/hadoop/pull/6633 |
| Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets |
| uname | Linux e98b32386339 5.15.0-94-generic #104-Ubuntu SMP Tue Jan 9 15:25:40 UTC 2024 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | dev-support/bin/hadoop.sh |
| git revision | trunk / 99b2f496cbf3df06361f699569c24b904b676b44 |
| Default Java | Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06 |
| Multi-JDK versions | /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06 |
| Test Results | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6633/39/testReport/ |
[jira] [Commented] (HADOOP-19120) [ABFS]: ApacheHttpClient adaptation as network library
[ https://issues.apache.org/jira/browse/HADOOP-19120?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17833157#comment-17833157 ]

ASF GitHub Bot commented on HADOOP-19120:
-----------------------------------------

hadoop-yetus commented on PR #6633:
URL: https://github.com/apache/hadoop/pull/6633#issuecomment-2031854632

:confetti_ball: **+1 overall**

| Vote | Subsystem | Runtime | Logfile | Comment |
|:----:|----------:|--------:|:-------:|:-------:|
| +0 :ok: | reexec | 0m 32s | | Docker mode activated. |
|||| _ Prechecks _ |
| +1 :green_heart: | dupname | 0m 1s | | No case conflicting files found. |
| +0 :ok: | codespell | 0m 0s | | codespell was not available. |
| +0 :ok: | detsecrets | 0m 0s | | detect-secrets was not available. |
| +1 :green_heart: | @author | 0m 0s | | The patch does not contain any @author tags. |
| +1 :green_heart: | test4tests | 0m 0s | | The patch appears to include 22 new or modified test files. |
|||| _ trunk Compile Tests _ |
| +1 :green_heart: | mvninstall | 44m 4s | | trunk passed |
| +1 :green_heart: | compile | 0m 41s | | trunk passed with JDK Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1 |
| +1 :green_heart: | compile | 0m 36s | | trunk passed with JDK Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06 |
| +1 :green_heart: | checkstyle | 0m 34s | | trunk passed |
| +1 :green_heart: | mvnsite | 0m 40s | | trunk passed |
| +1 :green_heart: | javadoc | 0m 36s | | trunk passed with JDK Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1 |
| +1 :green_heart: | javadoc | 0m 35s | | trunk passed with JDK Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06 |
| +1 :green_heart: | spotbugs | 1m 7s | | trunk passed |
| +1 :green_heart: | shadedclient | 34m 6s | | branch has no errors when building and testing our client artifacts. |
| -0 :warning: | patch | 34m 25s | | Used diff version of patch file. Binary files and potentially other changes not applied. Please rebase and squash commits if necessary. |
|||| _ Patch Compile Tests _ |
| +1 :green_heart: | mvninstall | 0m 28s | | the patch passed |
| +1 :green_heart: | compile | 0m 30s | | the patch passed with JDK Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1 |
| +1 :green_heart: | javac | 0m 30s | | the patch passed |
| +1 :green_heart: | compile | 0m 24s | | the patch passed with JDK Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06 |
| +1 :green_heart: | javac | 0m 24s | | the patch passed |
| +1 :green_heart: | blanks | 0m 0s | | The patch has no blanks issues. |
| -0 :warning: | checkstyle | 0m 18s | [/results-checkstyle-hadoop-tools_hadoop-azure.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6633/38/artifact/out/results-checkstyle-hadoop-tools_hadoop-azure.txt) | hadoop-tools/hadoop-azure: The patch generated 9 new + 18 unchanged - 0 fixed = 27 total (was 18) |
| +1 :green_heart: | mvnsite | 0m 28s | | the patch passed |
| +1 :green_heart: | javadoc | 0m 25s | | the patch passed with JDK Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1 |
| +1 :green_heart: | javadoc | 0m 25s | | the patch passed with JDK Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06 |
| +1 :green_heart: | spotbugs | 1m 4s | | the patch passed |
| +1 :green_heart: | shadedclient | 33m 54s | | patch has no errors when building and testing our client artifacts. |
|||| _ Other Tests _ |
| +1 :green_heart: | unit | 2m 24s | | hadoop-azure in the patch passed. |
| +1 :green_heart: | asflicense | 0m 36s | | The patch does not generate ASF License warnings. |
| | | 128m 59s | | |

| Subsystem | Report/Notes |
|----------:|:-------------|
| Docker | ClientAPI=1.45 ServerAPI=1.45 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6633/38/artifact/out/Dockerfile |
| GITHUB PR | https://github.com/apache/hadoop/pull/6633 |
| Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets |
| uname | Linux 2521b3c75554 5.15.0-94-generic #104-Ubuntu SMP Tue Jan 9 15:25:40 UTC 2024 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | dev-support/bin/hadoop.sh |
| git revision | trunk / 0eacd161d0104391d1c184131d343d3b64821d53 |
| Default Java | Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06 |
| Multi-JDK versions | /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06 |
| Test Results | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6633/38/testReport/ |