absurdfarce commented on code in PR #1902:
URL: 
https://github.com/apache/cassandra-java-driver/pull/1902#discussion_r1434162481


##########
core/src/main/java/com/datastax/oss/driver/internal/core/config/cloud/CloudConfigFactory.java:
##########
@@ -225,22 +228,51 @@ protected TrustManagerFactory createTrustManagerFactory(
   @NonNull
   protected BufferedReader fetchProxyMetadata(
       @NonNull URL metadataServiceUrl, @NonNull SSLContext sslContext) throws 
IOException {
-    try {
-      HttpsURLConnection connection = (HttpsURLConnection) 
metadataServiceUrl.openConnection();
+    HttpsURLConnection connection = null;
+    int attempt = 0;
+    while (attempt < METADATA_RETRY_MAX_ATTEMPTS) {
+      attempt++;
+      connection = (HttpsURLConnection) metadataServiceUrl.openConnection();
       connection.setSSLSocketFactory(sslContext.getSocketFactory());
       connection.setRequestMethod("GET");
       connection.setRequestProperty("host", "localhost");
-      return new BufferedReader(
-          new InputStreamReader(connection.getInputStream(), 
StandardCharsets.UTF_8));
-    } catch (ConnectException e) {
-      throw new IllegalStateException(
-          "Unable to connect to cloud metadata service. Please make sure your 
cluster is not parked or terminated",
-          e);
-    } catch (UnknownHostException e) {
-      throw new IllegalStateException(
-          "Unable to resolve host for cloud metadata service. Please make sure 
your cluster is not terminated",
-          e);
+      // if this is the last attempt, throw
+      // else if the response code is 401, 421, or 5xx, retry
+      // else, throw
+      if (attempt < METADATA_RETRY_MAX_ATTEMPTS
+          && (connection.getResponseCode() == 401
+              || connection.getResponseCode() == 421
+              || connection.getResponseCode() >= 500)) {
+        try {
+          Thread.sleep(METADATA_RETRY_INITIAL_DELAY_MS);
+          continue;
+        } catch (InterruptedException interruptedException) {
+          throw new IOException(
+              "Interrupted while waiting to retry metadata fetch", 
interruptedException);
+        }
+      }
+
+      // last attempt, still 421 or 401
+      if (connection.getResponseCode() == 421 || connection.getResponseCode() 
== 401) {

Review Comment:
   Don't we want to throw an exception of some kind in this case as long as the 
response code isn't a 200?  I'm trying to imagine a situation in which we would 
want to proceed with trying to get an InputStream from the connection if we got 
a 3xx, 4xx or 5xx response.
   
   There are other 2xx responses besides 200 ([the 
spec](https://www.rfc-editor.org/rfc/rfc9110#section-15.3) has some more 
details) but none of those seem to apply in our case.



##########
core/src/main/java/com/datastax/oss/driver/internal/core/config/cloud/CloudConfigFactory.java:
##########
@@ -225,22 +228,51 @@ protected TrustManagerFactory createTrustManagerFactory(
   @NonNull
   protected BufferedReader fetchProxyMetadata(
       @NonNull URL metadataServiceUrl, @NonNull SSLContext sslContext) throws 
IOException {
-    try {
-      HttpsURLConnection connection = (HttpsURLConnection) 
metadataServiceUrl.openConnection();
+    HttpsURLConnection connection = null;
+    int attempt = 0;
+    while (attempt < METADATA_RETRY_MAX_ATTEMPTS) {
+      attempt++;
+      connection = (HttpsURLConnection) metadataServiceUrl.openConnection();
       connection.setSSLSocketFactory(sslContext.getSocketFactory());
       connection.setRequestMethod("GET");
       connection.setRequestProperty("host", "localhost");
-      return new BufferedReader(
-          new InputStreamReader(connection.getInputStream(), 
StandardCharsets.UTF_8));
-    } catch (ConnectException e) {
-      throw new IllegalStateException(
-          "Unable to connect to cloud metadata service. Please make sure your 
cluster is not parked or terminated",
-          e);
-    } catch (UnknownHostException e) {
-      throw new IllegalStateException(
-          "Unable to resolve host for cloud metadata service. Please make sure 
your cluster is not terminated",
-          e);
+      // if this is the last attempt, throw
+      // else if the response code is 401, 421, or 5xx, retry
+      // else, throw
+      if (attempt < METADATA_RETRY_MAX_ATTEMPTS

Review Comment:
   I think you'd be better off handling the attempt >= 
METADATA_RETRY_MAX_ATTEMPTS first.  The benefit with that approach is that 
you're guaranteed that everything afterwards has not yet hit the max number of 
attempts (and is therefore subject to retry):
   
   ```java
   int rc = connection.getResponseCode();
   if (attempt >= METADATA_RETRY_MAX_ATTEMPTS) {
     if (rc == HttpURLConnection.HTTP_OK) {
         try {
           return new BufferedReader(
               new InputStreamReader(connection.getInputStream(), 
StandardCharsets.UTF_8));
         } catch (ConnectException e) {
           throw new IllegalStateException(
               "Unable to connect to cloud metadata service. Please make sure 
your cluster is not parked or terminated",
               e);
         } catch (UnknownHostException e) {
           throw new IllegalStateException(
               "Unable to resolve host for cloud metadata service. Please make 
sure your cluster is not terminated",
               e);
         }
     }
     else {
           // TODO: Might also want to add some logging here (or maybe above) 
about what the observed return code was
           throw new IllegalStateException(
               "Unable to fetch metadata from cloud metadata service. Please 
make sure your cluster is not parked or terminated");
         }
     }
   }
   
   assert attempt < METADATA_RETRY_MAX_ATTEMPTS;
   ```



##########
core/src/main/java/com/datastax/oss/driver/internal/core/config/cloud/CloudConfigFactory.java:
##########
@@ -225,22 +228,51 @@ protected TrustManagerFactory createTrustManagerFactory(
   @NonNull
   protected BufferedReader fetchProxyMetadata(
       @NonNull URL metadataServiceUrl, @NonNull SSLContext sslContext) throws 
IOException {
-    try {
-      HttpsURLConnection connection = (HttpsURLConnection) 
metadataServiceUrl.openConnection();
+    HttpsURLConnection connection = null;
+    int attempt = 0;
+    while (attempt < METADATA_RETRY_MAX_ATTEMPTS) {
+      attempt++;
+      connection = (HttpsURLConnection) metadataServiceUrl.openConnection();
       connection.setSSLSocketFactory(sslContext.getSocketFactory());
       connection.setRequestMethod("GET");
       connection.setRequestProperty("host", "localhost");
-      return new BufferedReader(
-          new InputStreamReader(connection.getInputStream(), 
StandardCharsets.UTF_8));
-    } catch (ConnectException e) {
-      throw new IllegalStateException(
-          "Unable to connect to cloud metadata service. Please make sure your 
cluster is not parked or terminated",
-          e);
-    } catch (UnknownHostException e) {
-      throw new IllegalStateException(
-          "Unable to resolve host for cloud metadata service. Please make sure 
your cluster is not terminated",
-          e);
+      // if this is the last attempt, throw
+      // else if the response code is 401, 421, or 5xx, retry
+      // else, throw
+      if (attempt < METADATA_RETRY_MAX_ATTEMPTS
+          && (connection.getResponseCode() == 401
+              || connection.getResponseCode() == 421
+              || connection.getResponseCode() >= 500)) {

Review Comment:
   This logic should also probably be pulled out into a simple helper method, 
something like this:
   
   ```java
   private boolean shouldRetry(int rc) {
      return connection.getResponseCode() == 401
                 || connection.getResponseCode() == 421
                 || connection.getResponseCode() >= 500;
   }
   ```



##########
core/src/main/java/com/datastax/oss/driver/internal/core/config/cloud/CloudConfigFactory.java:
##########
@@ -225,22 +228,51 @@ protected TrustManagerFactory createTrustManagerFactory(
   @NonNull
   protected BufferedReader fetchProxyMetadata(
       @NonNull URL metadataServiceUrl, @NonNull SSLContext sslContext) throws 
IOException {
-    try {
-      HttpsURLConnection connection = (HttpsURLConnection) 
metadataServiceUrl.openConnection();
+    HttpsURLConnection connection = null;
+    int attempt = 0;
+    while (attempt < METADATA_RETRY_MAX_ATTEMPTS) {
+      attempt++;
+      connection = (HttpsURLConnection) metadataServiceUrl.openConnection();
       connection.setSSLSocketFactory(sslContext.getSocketFactory());
       connection.setRequestMethod("GET");
       connection.setRequestProperty("host", "localhost");
-      return new BufferedReader(
-          new InputStreamReader(connection.getInputStream(), 
StandardCharsets.UTF_8));
-    } catch (ConnectException e) {
-      throw new IllegalStateException(
-          "Unable to connect to cloud metadata service. Please make sure your 
cluster is not parked or terminated",
-          e);
-    } catch (UnknownHostException e) {
-      throw new IllegalStateException(
-          "Unable to resolve host for cloud metadata service. Please make sure 
your cluster is not terminated",
-          e);
+      // if this is the last attempt, throw
+      // else if the response code is 401, 421, or 5xx, retry
+      // else, throw
+      if (attempt < METADATA_RETRY_MAX_ATTEMPTS
+          && (connection.getResponseCode() == 401
+              || connection.getResponseCode() == 421
+              || connection.getResponseCode() >= 500)) {

Review Comment:
   Here (and in the other spots throughout this code) you'd be better off using 
the constants defined by HttpURLConnection such as HTTP_OK for 200 (and so on). 
 It makes the code a bit more descriptive about what's going on.



##########
core/src/main/java/com/datastax/oss/driver/internal/core/config/cloud/CloudConfigFactory.java:
##########
@@ -225,22 +228,51 @@ protected TrustManagerFactory createTrustManagerFactory(
   @NonNull
   protected BufferedReader fetchProxyMetadata(
       @NonNull URL metadataServiceUrl, @NonNull SSLContext sslContext) throws 
IOException {
-    try {
-      HttpsURLConnection connection = (HttpsURLConnection) 
metadataServiceUrl.openConnection();
+    HttpsURLConnection connection = null;
+    int attempt = 0;
+    while (attempt < METADATA_RETRY_MAX_ATTEMPTS) {
+      attempt++;
+      connection = (HttpsURLConnection) metadataServiceUrl.openConnection();
       connection.setSSLSocketFactory(sslContext.getSocketFactory());
       connection.setRequestMethod("GET");
       connection.setRequestProperty("host", "localhost");
-      return new BufferedReader(
-          new InputStreamReader(connection.getInputStream(), 
StandardCharsets.UTF_8));
-    } catch (ConnectException e) {
-      throw new IllegalStateException(
-          "Unable to connect to cloud metadata service. Please make sure your 
cluster is not parked or terminated",
-          e);
-    } catch (UnknownHostException e) {
-      throw new IllegalStateException(
-          "Unable to resolve host for cloud metadata service. Please make sure 
your cluster is not terminated",
-          e);
+      // if this is the last attempt, throw
+      // else if the response code is 401, 421, or 5xx, retry
+      // else, throw
+      if (attempt < METADATA_RETRY_MAX_ATTEMPTS
+          && (connection.getResponseCode() == 401
+              || connection.getResponseCode() == 421
+              || connection.getResponseCode() >= 500)) {

Review Comment:
   It's not immediately clear to me that this is the extent of response codes 
we want to handle here.
   
   I agree with retrying on any 5xx error.  It also probably does make sense to 
retry on a 401 and 421 given what we've seen from this service.  But there are 
a whole set of [4xx response 
codes](https://www.rfc-editor.org/rfc/rfc9110#name-client-error-4xx) all of 
which indicate that the client did something wrong (from the server's 
perspective).  We could see _any_ of these codes coming off of the metadata 
service... what do we want to do in that case?  We may need to spend some more 
time looking at this list and thinking about that.
   
   There are also a lot of [3xx 
errors](https://www.rfc-editor.org/rfc/rfc9110#name-redirection-3xx) but most 
of those involve the server telling the client that the thing it's asking for 
is somewhere else.  In that case retrying that specific request probably 
doesn't make sense... but I would expect a fully-formed HTTP client to send 
another (slightly different) request based on the feedback it got from the 
server.  We're not building a fully-formed HTTP client, though, so I think I'm 
okay with not retrying 3xx errors for now.
   
   There is part of me wondering if we shouldn't put a real HTTP client in here 
for this... 



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org

Reply via email to