andrew4699 commented on code in PR #864:
URL: https://github.com/apache/polaris/pull/864#discussion_r1929312972


##########
service/common/src/main/java/org/apache/polaris/service/exception/IcebergExceptionMapper.java:
##########
@@ -144,4 +112,56 @@ public static boolean containsAnyAccessDeniedHint(String 
message) {
   public static Collection<String> getAccessDeniedHints() {
     return ImmutableSet.copyOf(ACCESS_DENIED_HINTS);
   }
+
+  static int mapExceptionToResponseCode(RuntimeException rex) {
+    // Cloud exceptions
+    if (rex instanceof S3Exception
+        || rex instanceof AzureException
+        || rex instanceof StorageException) {
+      if (doesAnyThrowableContainAccessDeniedHint(rex)) {
+        return Response.Status.FORBIDDEN.getStatusCode();
+      }
+
+      int httpCode =
+          switch (rex) {
+            case S3Exception s3e -> s3e.statusCode();
+            case HttpResponseException hre -> 
hre.getResponse().getStatusCode();
+            case StorageException se -> se.getCode();
+            default -> -1;
+          };
+
+      if (300 <= httpCode && httpCode <= 499) {
+        return httpCode;

Review Comment:
   > (Tangential comment about zone IDs, etc.: Ideally storage config should be 
validated at catalog creation time. Then, at REST Catalog request time, the 
server will be able to assume the config is valid).
   
   Agree this would be a nice feature. It unfortunately wouldn't be sufficient 
as the information can become invalid after creation, ie due to the user making 
changes in AWS.
   
   > This matches 500 fairly well, but not all 5xx errors, IMHO.
   > 
   > That said, what is "client-provided information"? In the context of this 
exception mapper, if the client is the REST Catalog, for example, it did not 
provide catalog configuration as part of the request it sent.
   > 
   > I suppose it may be necessary to have different exception mappers for the 
REST Catalog API and the Management API for fine-tuning error responses. WDYT?
   
   I think these are currently only relevant to the REST Catalog API as the 
Management API doesn't currently call cloud storage. You're correct that the 
client didn't provide the information in this request, but they did in a 
previous request.
   
   The client/server model is tricky in Polaris. The client/server jointly own 
a dependency of the server: cloud storage resources. It's the client's 
responsibility to give the server the correct permissions (to assume the role 
and access the bucket), and then inform the server of the correct role/bucket 
name. The server has to expose a way for the client to do that, for example by 
providing a user/role identifier for the client to give those permissions to.
   
   The difference between 4xx and 5xx is generally whose fault it is. The 
client owns the bucket, so in absence of a code bug it's generally the client's 
fault (and responsibility to fix) that the bucket is inaccessible.
   
   > A 3xx response indicates the client should re-sent the request to a 
different URL. Again, this does not make sense if the reason is catalog (mis-) 
configuration.
   
   I agree the 3xx case is odd. IMO it would make sense to map those to 400 as 
the solution probably isn't a simple redirect.
   
   > I do not think we should consider Polaris as a proxy for storage. Even 
though today it loads metadata from S3 (for example) on every request, it does 
not have to be this way in principle... I believe I saw discussions about 
storing metadata in the Polaris database.
   
   If that happened then wouldn't this entirely be a non-issue? There will 
presumably still be requests to storage in which case we should apply some 
principle.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to