jackye1995 commented on code in PR #5902:
URL: https://github.com/apache/iceberg/pull/5902#discussion_r990813762
##########
docs/aws.md:
##########
@@ -573,6 +573,56 @@ spark-sql --packages
org.apache.iceberg:iceberg-spark3-runtime:{{% icebergVersio
--conf
spark.sql.catalog.my_catalog.client.assume-role.region=ap-northeast-1
```
+### HTTP Client Configurations
+AWS clients support two types of HTTP Client, [URL Connection HTTP Client](https://mvnrepository.com/artifact/software.amazon.awssdk/url-connection-client)
+and [Apache HTTP Client](https://mvnrepository.com/artifact/software.amazon.awssdk/apache-client).
+By default, AWS clients use the **URL Connection** HTTP Client to communicate with the service.
+This HTTP client optimizes for minimum dependencies and startup latency but supports less functionality than other implementations.
+In contrast, Apache HTTP Client supports more functionality and more customized settings, such as expect-continue handshake and TCP KeepAlive, at the cost of an extra dependency and additional startup latency.
+
+For more details on configuration, see sections [URL Connection HTTP Client Configurations](#url-connection-http-client-configurations) and [Apache HTTP Client Configurations](#apache-http-client-configurations).
+
+Configure the following property to set the type of HTTP client:
+
+| Property         | Default       | Description                                                                                                |
+|------------------|---------------|------------------------------------------------------------------------------------------------------------|
+| http-client.type | urlconnection | Types of HTTP Client. <br/> `urlconnection`: URL Connection HTTP Client <br/> `apache`: Apache HTTP Client |
+
+#### URL Connection HTTP Client Configurations
+
+URL Connection HTTP Client has the following configurable properties:
+
+| Property                                        | Default | Description |
+|-------------------------------------------------|---------|-------------|
+| http-client.urlconnection.socket-timeout-ms     | null    | An optional [socket timeout](https://sdk.amazonaws.com/java/api/latest/software/amazon/awssdk/http/urlconnection/UrlConnectionHttpClient.Builder.html#socketTimeout(java.time.Duration)) in milliseconds |
+| http-client.urlconnection.connection-timeout-ms | null    | An optional [connection timeout](https://sdk.amazonaws.com/java/api/latest/software/amazon/awssdk/http/urlconnection/UrlConnectionHttpClient.Builder.html#connectionTimeout(java.time.Duration)) in milliseconds |
+
+Users can use catalog properties to override the defaults. For example, to configure the socket timeout for URL Connection HTTP Client when starting a Spark shell, one can add:
+```shell
+--conf spark.sql.catalog.my_catalog.http-client.urlconnection.socket-timeout-ms=80
+```
+
+#### Apache HTTP Client Configurations
+
+Apache HTTP Client has the following configurable properties:
+
+| Property                                              | Default                   | Description |
+|-------------------------------------------------------|---------------------------|-------------|
+| http-client.apache.socket-timeout-ms                  | null                      | An optional [socket timeout](https://sdk.amazonaws.com/java/api/latest/software/amazon/awssdk/http/apache/ApacheHttpClient.Builder.html#socketTimeout(java.time.Duration)) in milliseconds |
+| http-client.apache.connection-timeout-ms              | null                      | An optional [connection timeout](https://sdk.amazonaws.com/java/api/latest/software/amazon/awssdk/http/apache/ApacheHttpClient.Builder.html#connectionTimeout(java.time.Duration)) in milliseconds |
+| http-client.apache.connection-acquisition-timeout-ms  | null                      | An optional [connection acquisition timeout](https://sdk.amazonaws.com/java/api/latest/software/amazon/awssdk/http/apache/ApacheHttpClient.Builder.html#connectionAcquisitionTimeout(java.time.Duration)) in milliseconds |
+| http-client.apache.connection-max-idle-time-ms        | null                      | An optional [connection max idle time](https://sdk.amazonaws.com/java/api/latest/software/amazon/awssdk/http/apache/ApacheHttpClient.Builder.html#connectionMaxIdleTime(java.time.Duration)) in milliseconds |
+| http-client.apache.connection-time-to-live-ms         | null                      | An optional [connection time to live](https://sdk.amazonaws.com/java/api/latest/software/amazon/awssdk/http/apache/ApacheHttpClient.Builder.html#connectionTimeToLive(java.time.Duration)) in milliseconds |
+| http-client.apache.expect-continue-enabled            | null, disabled by default | An optional `true/false` setting that decides whether to enable [expect continue](https://sdk.amazonaws.com/java/api/latest/software/amazon/awssdk/http/apache/ApacheHttpClient.Builder.html#expectContinueEnabled(java.lang.Boolean)) |
+| http-client.apache.max-connections                    | null                      | An optional [max connections](https://sdk.amazonaws.com/java/api/latest/software/amazon/awssdk/http/apache/ApacheHttpClient.Builder.html#maxConnections(java.lang.Integer)) value, as an integer |
+| http-client.apache.tcp-keep-alive-enabled             | null, disabled by default | An optional `true/false` setting that decides whether to enable [tcp keep alive](https://sdk.amazonaws.com/java/api/latest/software/amazon/awssdk/http/apache/ApacheHttpClient.Builder.html#tcpKeepAlive(java.lang.Boolean)) |
+| http-client.apache.use-idle-connection-reaper-enabled | null, enabled by default  | An optional `true/false` setting that decides whether to [use idle connection reaper](https://sdk.amazonaws.com/java/api/latest/software/amazon/awssdk/http/apache/ApacheHttpClient.Builder.html#useIdleConnectionReaper(java.lang.Boolean)) |
+
+Users can use catalog properties to override the defaults. For example, to configure the max connections for Apache HTTP Client when starting a Spark shell, one can add:
+```shell
+--conf spark.sql.catalog.my_catalog.http-client.apache.max-connections=5
+```
Review Comment:
We probably want one more PR to update all the Spark examples in `aws.md`,
so for this one we can keep it as is.
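
For reference, a combined example could look roughly like the sketch below. It assumes the `my_catalog` catalog name used in the hunk above and only uses property names that appear in the new tables. The values (`5` connections, a `60000` ms socket timeout) are purely illustrative, and the `CATALOG`, `PREFIX`, and `CONF_ARGS` variable names are hypothetical helpers, not part of Iceberg or Spark.

```shell
# Catalog name taken from the examples in the hunk above.
CATALOG=my_catalog

# Common prefix for the HTTP client catalog properties.
PREFIX="spark.sql.catalog.${CATALOG}.http-client"

# Select the Apache HTTP client and tune two of its documented properties.
# The values below are illustrative assumptions, not recommendations.
CONF_ARGS="--conf ${PREFIX}.type=apache --conf ${PREFIX}.apache.max-connections=5 --conf ${PREFIX}.apache.socket-timeout-ms=60000"

# Print the flags so they can be inspected before use.
echo "${CONF_ARGS}"
```

The printed flags can then be appended to an existing `spark-sql` or `spark-shell` command line.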