[ https://issues.apache.org/jira/browse/HADOOP-19120?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17854377#comment-17854377 ]
ASF GitHub Bot commented on HADOOP-19120:
-----------------------------------------

saxenapranav commented on code in PR #6633:
URL: https://github.com/apache/hadoop/pull/6633#discussion_r1636216799


##########
hadoop-tools/hadoop-azure/src/main/java/org/apache/hadoop/fs/azurebfs/services/AbfsHttpOperation.java:
##########
@@ -20,57 +20,51 @@

 import java.io.IOException;
 import java.io.InputStream;
-import java.io.OutputStream;
 import java.net.HttpURLConnection;
-import java.net.ProtocolException;
 import java.net.URL;
+import java.util.ArrayList;
 import java.util.List;
-
-import javax.net.ssl.HttpsURLConnection;
-import javax.net.ssl.SSLSocketFactory;
-
-import org.apache.hadoop.classification.VisibleForTesting;
-import org.apache.hadoop.fs.azurebfs.utils.UriUtils;
-import org.apache.hadoop.security.ssl.DelegatingSSLSocketFactory;
+import java.util.Map;

 import com.fasterxml.jackson.core.JsonFactory;
 import com.fasterxml.jackson.core.JsonParser;
 import com.fasterxml.jackson.core.JsonToken;
 import com.fasterxml.jackson.databind.ObjectMapper;
 import org.slf4j.Logger;
-import org.slf4j.LoggerFactory;

 import org.apache.hadoop.fs.azurebfs.constants.AbfsHttpConstants;
 import org.apache.hadoop.fs.azurebfs.constants.HttpHeaderConfigurations;
 import org.apache.hadoop.fs.azurebfs.contracts.services.AbfsPerfLoggable;
 import org.apache.hadoop.fs.azurebfs.contracts.services.ListResultSchema;
-
-import static org.apache.hadoop.fs.azurebfs.constants.AbfsHttpConstants.EXPECT_100_JDK_ERROR;
-import static org.apache.hadoop.fs.azurebfs.constants.AbfsHttpConstants.HUNDRED_CONTINUE;
-import static org.apache.hadoop.fs.azurebfs.constants.HttpHeaderConfigurations.EXPECT;
+import org.apache.hadoop.fs.azurebfs.utils.UriUtils;

 /**
- * Represents an HTTP operation.
+ * Base Http operation class for orchestrating server IO calls. Child classes would
+ * define the certain orchestration implementation on the basis of network library used.
+ * <p>
+ * For JDK netlib usage, the child class would be {@link AbfsJdkHttpOperation}. <br>
+ * For ApacheHttpClient netlib usage, the child class would be {@link AbfsAHCHttpOperation}.
+ * </p>
  */
-public class AbfsHttpOperation implements AbfsPerfLoggable {
-  private static final Logger LOG = LoggerFactory.getLogger(AbfsHttpOperation.class);
+public abstract class AbfsHttpOperation implements AbfsPerfLoggable {

Review Comment:
   Added.



> [ABFS]: ApacheHttpClient adaptation as network library
> ------------------------------------------------------
>
>                 Key: HADOOP-19120
>                 URL: https://issues.apache.org/jira/browse/HADOOP-19120
>             Project: Hadoop Common
>          Issue Type: Sub-task
>          Components: fs/azure
>    Affects Versions: 3.5.0
>            Reporter: Pranav Saxena
>            Assignee: Pranav Saxena
>            Priority: Major
>              Labels: pull-request-available
>
> Apache HttpClient is more feature-rich and flexible, and gives the
> application more granular control over networking parameters.
> ABFS currently relies on the JDK-net library. This library is managed by
> OpenJDK and has no performance problem. However, it limits the application's
> control over networking, and it exposes very few APIs and hooks that the
> application can use to get metrics or to choose which connection is reused
> and when. ApacheHttpClient provides hooks to fetch important metrics and
> control networking parameters.
> A custom implementation of the connection pool is used. The implementation
> is adapted from the JDK 8 connection pooling. Reasons for doing so:
>  1. The PoolingHttpClientConnectionManager heuristic caches all the reusable
> connections it has created. JDK's implementation caches only a limited
> number of connections. The limit is given by the JVM system property
> "http.maxConnections"; if the property is not set, it defaults to 5.
> Connection-establishment latency increased when all the connections were
> cached. Hence the pooling heuristic of the JDK netlib is adapted.
>  2. PoolingHttpClientConnectionManager expects the application to provide
> `setMaxPerRoute` and `setMaxTotal`, which the implementation uses as the
> total number of connections it can create. For an application using ABFS, it
> is not feasible to provide such a value when initialising the
> connectionManager. JDK's implementation has no cap on the number of
> connections it can have open at a given moment. Hence the pooling heuristic
> of the JDK netlib is adapted.
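
The pooling heuristic described in the two points above can be illustrated with a minimal sketch. It is not the class added in PR #6633: the name `BoundedKeepAliveCache`, its methods, and the most-recently-returned-first reuse order are assumptions made only for illustration, and it ignores per-route bookkeeping. It shows the two properties the description relies on: at most `http.maxConnections` (default 5) idle connections are kept for reuse, and nothing limits how many connections the caller opens when the cache is empty.

```java
import java.util.ArrayDeque;
import java.util.Deque;

/**
 * Hypothetical sketch of a bounded idle-connection cache.
 * Only the number of cached (idle) connections is capped; the caller is free
 * to open as many new connections as it needs when get() returns null.
 */
class BoundedKeepAliveCache<C extends AutoCloseable> {
  private final int maxCached;
  private final Deque<C> idle = new ArrayDeque<>();

  /** Reads the cap from the "http.maxConnections" system property, defaulting to 5. */
  BoundedKeepAliveCache() {
    this(Integer.getInteger("http.maxConnections", 5));
  }

  BoundedKeepAliveCache(int maxCached) {
    if (maxCached < 0) {
      throw new IllegalArgumentException("maxCached must be >= 0");
    }
    this.maxCached = maxCached;
  }

  /** Returns the most recently cached idle connection, or null if none is cached. */
  synchronized C get() {
    return idle.pollFirst();
  }

  /**
   * Offers a reusable connection back to the cache.
   * If the cache is already full, the connection is closed instead of kept.
   *
   * @return true if the connection was cached, false if it was closed.
   */
  boolean put(C conn) {
    synchronized (this) {
      if (idle.size() < maxCached) {
        idle.addFirst(conn);
        return true;
      }
    }
    try {
      conn.close(); // close outside the lock; best effort
    } catch (Exception e) {
      // Ignored in this sketch; a real implementation would log the failure.
    }
    return false;
  }

  /** Number of idle connections currently cached. */
  synchronized int size() {
    return idle.size();
  }
}
```

A caller would try `get()` first, open a fresh connection when it returns null, and hand the connection to `put()` once the response has been fully consumed and the connection is still reusable.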