jackye1995 commented on code in PR #5902:
URL: https://github.com/apache/iceberg/pull/5902#discussion_r990813762


##########
docs/aws.md:
##########
@@ -573,6 +573,56 @@ spark-sql --packages org.apache.iceberg:iceberg-spark3-runtime:{{% icebergVersio
     --conf spark.sql.catalog.my_catalog.client.assume-role.region=ap-northeast-1
 ```
 
+### HTTP Client Configurations
+AWS clients support two types of HTTP client, [URL Connection HTTP Client](https://mvnrepository.com/artifact/software.amazon.awssdk/url-connection-client) 
+and [Apache HTTP Client](https://mvnrepository.com/artifact/software.amazon.awssdk/apache-client).
+By default, AWS clients use the **URL Connection** HTTP Client to communicate with the service. 
+This HTTP client optimizes for minimum dependencies and startup latency but supports less functionality than other implementations. 
+In contrast, the Apache HTTP Client supports more functionality and more customized settings, such as expect-continue handshake and TCP KeepAlive, at the cost of an extra dependency and additional startup latency. 
+
+For configuration details, see the sections [URL Connection HTTP Client Configurations](#url-connection-http-client-configurations) and [Apache HTTP Client Configurations](#apache-http-client-configurations).
+
+Configure the following property to set the type of HTTP client:
+
+| Property         | Default       | Description                                                                                                |
+|------------------|---------------|------------------------------------------------------------------------------------------------------------|
+| http-client.type | urlconnection | Type of HTTP client. <br/> `urlconnection`: URL Connection HTTP Client <br/> `apache`: Apache HTTP Client |
+
+#### URL Connection HTTP Client Configurations
+
+URL Connection HTTP Client has the following configurable properties:
+
+| Property                                        | Default | Description                                                                                                                                                                                                      |
+|-------------------------------------------------|---------|------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
+| http-client.urlconnection.socket-timeout-ms     | null    | An optional [socket timeout](https://sdk.amazonaws.com/java/api/latest/software/amazon/awssdk/http/urlconnection/UrlConnectionHttpClient.Builder.html#socketTimeout(java.time.Duration)) in milliseconds         |
+| http-client.urlconnection.connection-timeout-ms | null    | An optional [connection timeout](https://sdk.amazonaws.com/java/api/latest/software/amazon/awssdk/http/urlconnection/UrlConnectionHttpClient.Builder.html#connectionTimeout(java.time.Duration)) in milliseconds |
+
+Users can use catalog properties to override the defaults. For example, to configure the socket timeout for the URL Connection HTTP Client when starting a Spark shell, add:
+```shell
+--conf spark.sql.catalog.my_catalog.http-client.urlconnection.socket-timeout-ms=80
+```
+
+#### Apache HTTP Client Configurations
+
+Apache HTTP Client has the following configurable properties:
+
+| Property                                              | Default              
     | Description                                                              
                                                                                
                                                                                
   |
+|-------------------------------------------------------|---------------------------|---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
+| http-client.apache.socket-timeout-ms                  | null                      | An optional [socket timeout](https://sdk.amazonaws.com/java/api/latest/software/amazon/awssdk/http/apache/ApacheHttpClient.Builder.html#socketTimeout(java.time.Duration)) in milliseconds |
+| http-client.apache.connection-timeout-ms              | null                      | An optional [connection timeout](https://sdk.amazonaws.com/java/api/latest/software/amazon/awssdk/http/apache/ApacheHttpClient.Builder.html#connectionTimeout(java.time.Duration)) in milliseconds |
+| http-client.apache.connection-acquisition-timeout-ms  | null                      | An optional [connection acquisition timeout](https://sdk.amazonaws.com/java/api/latest/software/amazon/awssdk/http/apache/ApacheHttpClient.Builder.html#connectionAcquisitionTimeout(java.time.Duration)) in milliseconds |
+| http-client.apache.connection-max-idle-time-ms        | null                      | An optional [connection max idle time](https://sdk.amazonaws.com/java/api/latest/software/amazon/awssdk/http/apache/ApacheHttpClient.Builder.html#connectionMaxIdleTime(java.time.Duration)) in milliseconds |
+| http-client.apache.connection-time-to-live-ms         | null                      | An optional [connection time to live](https://sdk.amazonaws.com/java/api/latest/software/amazon/awssdk/http/apache/ApacheHttpClient.Builder.html#connectionTimeToLive(java.time.Duration)) in milliseconds |
+| http-client.apache.expect-continue-enabled            | null, disabled by default | An optional `true/false` setting that decides whether to enable [expect continue](https://sdk.amazonaws.com/java/api/latest/software/amazon/awssdk/http/apache/ApacheHttpClient.Builder.html#expectContinueEnabled(java.lang.Boolean)) |
+| http-client.apache.max-connections                    | null                      | An optional [max connections](https://sdk.amazonaws.com/java/api/latest/software/amazon/awssdk/http/apache/ApacheHttpClient.Builder.html#maxConnections(java.lang.Integer)) as an integer |
+| http-client.apache.tcp-keep-alive-enabled             | null, disabled by default | An optional `true/false` setting that decides whether to enable [TCP keep-alive](https://sdk.amazonaws.com/java/api/latest/software/amazon/awssdk/http/apache/ApacheHttpClient.Builder.html#tcpKeepAlive(java.lang.Boolean)) |
+| http-client.apache.use-idle-connection-reaper-enabled | null, enabled by default  | An optional `true/false` setting that decides whether to [use idle connection reaper](https://sdk.amazonaws.com/java/api/latest/software/amazon/awssdk/http/apache/ApacheHttpClient.Builder.html#useIdleConnectionReaper(java.lang.Boolean)) |
+
+Users can use catalog properties to override the defaults. For example, to configure the max connections for the Apache HTTP Client when starting a Spark shell, add:
+```shell
+--conf spark.sql.catalog.my_catalog.http-client.apache.max-connections=5
+```

Review Comment:
   We probably want one more PR to update all the Spark examples in `aws.md`, 
so for this one we can keep it as is.
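
For illustration only (this is not part of the PR), here is a minimal sketch of how the properties documented in the hunk could be combined into Spark `--conf` flags. It assumes the catalog is named `my_catalog`, as in the existing examples. It also assumes that `http-client.type=apache` must be set for the `http-client.apache.*` properties to take effect, which the documented default of `urlconnection` suggests but the hunk does not state. The values `5` and `60000` are placeholders, not recommendations.

```shell
# Hypothetical catalog name, matching the `my_catalog` used in the doc examples.
CATALOG=my_catalog
PREFIX="spark.sql.catalog.${CATALOG}"

# Select the Apache HTTP client, then tune two of its properties.
# The numeric values are illustrative placeholders only.
CONF_ARGS="--conf ${PREFIX}.http-client.type=apache"
CONF_ARGS="${CONF_ARGS} --conf ${PREFIX}.http-client.apache.max-connections=5"
CONF_ARGS="${CONF_ARGS} --conf ${PREFIX}.http-client.apache.socket-timeout-ms=60000"

# Print the flags so they can be inspected before appending them to spark-sql.
echo "${CONF_ARGS}"
```

The printed flags can be appended to an existing `spark-sql` or `spark-shell` invocation alongside the other catalog settings.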



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

