absurdfarce commented on code in PR #1902: URL: https://github.com/apache/cassandra-java-driver/pull/1902#discussion_r1434162481
########## core/src/main/java/com/datastax/oss/driver/internal/core/config/cloud/CloudConfigFactory.java: ########## @@ -225,22 +228,51 @@ protected TrustManagerFactory createTrustManagerFactory( @NonNull protected BufferedReader fetchProxyMetadata( @NonNull URL metadataServiceUrl, @NonNull SSLContext sslContext) throws IOException { - try { - HttpsURLConnection connection = (HttpsURLConnection) metadataServiceUrl.openConnection(); + HttpsURLConnection connection = null; + int attempt = 0; + while (attempt < METADATA_RETRY_MAX_ATTEMPTS) { + attempt++; + connection = (HttpsURLConnection) metadataServiceUrl.openConnection(); connection.setSSLSocketFactory(sslContext.getSocketFactory()); connection.setRequestMethod("GET"); connection.setRequestProperty("host", "localhost"); - return new BufferedReader( - new InputStreamReader(connection.getInputStream(), StandardCharsets.UTF_8)); - } catch (ConnectException e) { - throw new IllegalStateException( - "Unable to connect to cloud metadata service. Please make sure your cluster is not parked or terminated", - e); - } catch (UnknownHostException e) { - throw new IllegalStateException( - "Unable to resolve host for cloud metadata service. Please make sure your cluster is not terminated", - e); + // if this is the last attempt, throw + // else if the response code is 401, 421, or 5xx, retry + // else, throw + if (attempt < METADATA_RETRY_MAX_ATTEMPTS + && (connection.getResponseCode() == 401 + || connection.getResponseCode() == 421 + || connection.getResponseCode() >= 500)) { + try { + Thread.sleep(METADATA_RETRY_INITIAL_DELAY_MS); + continue; + } catch (InterruptedException interruptedException) { + throw new IOException( + "Interrupted while waiting to retry metadata fetch", interruptedException); + } + } + + // last attempt, still 421 or 401 + if (connection.getResponseCode() == 421 || connection.getResponseCode() == 401) { Review Comment: Don't we want to throw an exception of some kind in this case as long as the response code isn't a 200? I'm trying to imagine a situation in which we would want to proceed with trying to get an InputStream from the connection if we got a 3xx, 4xx or 5xx response. There are other 2xx responses besides 200 ([the spec](https://www.rfc-editor.org/rfc/rfc9110#section-15.3) has some more details) but none of those seem to apply in our case. ########## core/src/main/java/com/datastax/oss/driver/internal/core/config/cloud/CloudConfigFactory.java: ########## @@ -225,22 +228,51 @@ protected TrustManagerFactory createTrustManagerFactory( @NonNull protected BufferedReader fetchProxyMetadata( @NonNull URL metadataServiceUrl, @NonNull SSLContext sslContext) throws IOException { - try { - HttpsURLConnection connection = (HttpsURLConnection) metadataServiceUrl.openConnection(); + HttpsURLConnection connection = null; + int attempt = 0; + while (attempt < METADATA_RETRY_MAX_ATTEMPTS) { + attempt++; + connection = (HttpsURLConnection) metadataServiceUrl.openConnection(); connection.setSSLSocketFactory(sslContext.getSocketFactory()); connection.setRequestMethod("GET"); connection.setRequestProperty("host", "localhost"); - return new BufferedReader( - new InputStreamReader(connection.getInputStream(), StandardCharsets.UTF_8)); - } catch (ConnectException e) { - throw new IllegalStateException( - "Unable to connect to cloud metadata service. Please make sure your cluster is not parked or terminated", - e); - } catch (UnknownHostException e) { - throw new IllegalStateException( - "Unable to resolve host for cloud metadata service. Please make sure your cluster is not terminated", - e); + // if this is the last attempt, throw + // else if the response code is 401, 421, or 5xx, retry + // else, throw + if (attempt < METADATA_RETRY_MAX_ATTEMPTS Review Comment: I think you'd be better off handling the attempt >= METADATA_RETRY_MAX_ATTEMPTS first. The benefit with that approach is that you're guaranteed that everything afterwards has not yet hit the max number of attempts (and is therefore subject to retry): ```java int rc = connection.getResponseCode(); if (attempt >= METADATA_RETRY_MAX_ATTEMPTS) { if (rc == HttpURLConnection.HTTP_OK) { try { return new BufferedReader( new InputStreamReader(connection.getInputStream(), StandardCharsets.UTF_8)); } catch (ConnectException e) { throw new IllegalStateException( "Unable to connect to cloud metadata service. Please make sure your cluster is not parked or terminated", e); } catch (UnknownHostException e) { throw new IllegalStateException( "Unable to resolve host for cloud metadata service. Please make sure your cluster is not terminated", e); } } else { // TODO: Might also want to add some logging here (or maybe above) about what the observed return code was throw new IllegalStateException( "Unable to fetch metadata from cloud metadata service. Please make sure your cluster is not parked or terminated"); } } } assert attempt < METADATA_RETRY_MAX_ATTEMPTS; ``` ########## core/src/main/java/com/datastax/oss/driver/internal/core/config/cloud/CloudConfigFactory.java: ########## @@ -225,22 +228,51 @@ protected TrustManagerFactory createTrustManagerFactory( @NonNull protected BufferedReader fetchProxyMetadata( @NonNull URL metadataServiceUrl, @NonNull SSLContext sslContext) throws IOException { - try { - HttpsURLConnection connection = (HttpsURLConnection) metadataServiceUrl.openConnection(); + HttpsURLConnection connection = null; + int attempt = 0; + while (attempt < METADATA_RETRY_MAX_ATTEMPTS) { + attempt++; + connection = (HttpsURLConnection) metadataServiceUrl.openConnection(); connection.setSSLSocketFactory(sslContext.getSocketFactory()); connection.setRequestMethod("GET"); connection.setRequestProperty("host", "localhost"); - return new BufferedReader( - new InputStreamReader(connection.getInputStream(), StandardCharsets.UTF_8)); - } catch (ConnectException e) { - throw new IllegalStateException( - "Unable to connect to cloud metadata service. Please make sure your cluster is not parked or terminated", - e); - } catch (UnknownHostException e) { - throw new IllegalStateException( - "Unable to resolve host for cloud metadata service. Please make sure your cluster is not terminated", - e); + // if this is the last attempt, throw + // else if the response code is 401, 421, or 5xx, retry + // else, throw + if (attempt < METADATA_RETRY_MAX_ATTEMPTS + && (connection.getResponseCode() == 401 + || connection.getResponseCode() == 421 + || connection.getResponseCode() >= 500)) { Review Comment: This logic should also probably be pulled out into a simple helper method, something like this: ```java private boolean shouldRetry(int rc) { return connection.getResponseCode() == 401 || connection.getResponseCode() == 421 || connection.getResponseCode() >= 500; } ``` ########## core/src/main/java/com/datastax/oss/driver/internal/core/config/cloud/CloudConfigFactory.java: ########## @@ -225,22 +228,51 @@ protected TrustManagerFactory createTrustManagerFactory( @NonNull protected BufferedReader fetchProxyMetadata( @NonNull URL metadataServiceUrl, @NonNull SSLContext sslContext) throws IOException { - try { - HttpsURLConnection connection = (HttpsURLConnection) metadataServiceUrl.openConnection(); + HttpsURLConnection connection = null; + int attempt = 0; + while (attempt < METADATA_RETRY_MAX_ATTEMPTS) { + attempt++; + connection = (HttpsURLConnection) metadataServiceUrl.openConnection(); connection.setSSLSocketFactory(sslContext.getSocketFactory()); connection.setRequestMethod("GET"); connection.setRequestProperty("host", "localhost"); - return new BufferedReader( - new InputStreamReader(connection.getInputStream(), StandardCharsets.UTF_8)); - } catch (ConnectException e) { - throw new IllegalStateException( - "Unable to connect to cloud metadata service. Please make sure your cluster is not parked or terminated", - e); - } catch (UnknownHostException e) { - throw new IllegalStateException( - "Unable to resolve host for cloud metadata service. Please make sure your cluster is not terminated", - e); + // if this is the last attempt, throw + // else if the response code is 401, 421, or 5xx, retry + // else, throw + if (attempt < METADATA_RETRY_MAX_ATTEMPTS + && (connection.getResponseCode() == 401 + || connection.getResponseCode() == 421 + || connection.getResponseCode() >= 500)) { Review Comment: Here (and in the other spots throughout this code) you'd be better off using the constants defined by HttpURLConnection such as HTTP_OK for 200 (and so on). It makes the code a bit more descriptive about what's going on. ########## core/src/main/java/com/datastax/oss/driver/internal/core/config/cloud/CloudConfigFactory.java: ########## @@ -225,22 +228,51 @@ protected TrustManagerFactory createTrustManagerFactory( @NonNull protected BufferedReader fetchProxyMetadata( @NonNull URL metadataServiceUrl, @NonNull SSLContext sslContext) throws IOException { - try { - HttpsURLConnection connection = (HttpsURLConnection) metadataServiceUrl.openConnection(); + HttpsURLConnection connection = null; + int attempt = 0; + while (attempt < METADATA_RETRY_MAX_ATTEMPTS) { + attempt++; + connection = (HttpsURLConnection) metadataServiceUrl.openConnection(); connection.setSSLSocketFactory(sslContext.getSocketFactory()); connection.setRequestMethod("GET"); connection.setRequestProperty("host", "localhost"); - return new BufferedReader( - new InputStreamReader(connection.getInputStream(), StandardCharsets.UTF_8)); - } catch (ConnectException e) { - throw new IllegalStateException( - "Unable to connect to cloud metadata service. Please make sure your cluster is not parked or terminated", - e); - } catch (UnknownHostException e) { - throw new IllegalStateException( - "Unable to resolve host for cloud metadata service. Please make sure your cluster is not terminated", - e); + // if this is the last attempt, throw + // else if the response code is 401, 421, or 5xx, retry + // else, throw + if (attempt < METADATA_RETRY_MAX_ATTEMPTS + && (connection.getResponseCode() == 401 + || connection.getResponseCode() == 421 + || connection.getResponseCode() >= 500)) { Review Comment: It's not immediately clear to me that this is the extent of response codes we want to handle here. I agree with retrying on any 5xx error. It also probably does make sense to retry on a 401 and 421 given what we've seen from this service. But there are a whole set of [4xx response codes](https://www.rfc-editor.org/rfc/rfc9110#name-client-error-4xx) all of which indicate that the client did something wrong (from the server's perspective). We could see _any_ of these codes coming off of the metadata service... what do we want to do in that case? We may need to spend some more time looking at this list and thinking about that. There are also a lot of [3xx errors](https://www.rfc-editor.org/rfc/rfc9110#name-redirection-3xx) but most of those involve the server telling the client that the thing it's asking for is somewhere else. In that case retrying that specific request probably doesn't make sense... but I would expect a fully-formed HTTP client to send another (slightly different) request based on the feedback it got from the server. We're not building a fully-formed HTTP client, though, so I think I'm okay with not retrying 3xx errors for now. There is part of me wondering if we shouldn't put a real HTTP client in here for this... -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org --------------------------------------------------------------------- To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org