steveloughran commented on code in PR #6407:
URL: https://github.com/apache/hadoop/pull/6407#discussion_r1478954399


##########
hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/impl/StoreContext.java:
##########
@@ -25,6 +25,7 @@
 import java.util.concurrent.CompletableFuture;
 import java.util.concurrent.ExecutorService;
 
+import org.apache.hadoop.fs.s3a.S3ObjectStorageClassFilter;

Review Comment:
   can you move this down to the rest of the org.apache imports? these guava
   things are in the wrong block due to the big search and replace which
   created them.
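   For reference, a sketch of the grouping being asked for (the usual
   hadoop-aws convention: java.* first, then third-party including shaded
   guava, then org.apache.*):
   
   ```java
   import java.util.concurrent.CompletableFuture;
   import java.util.concurrent.ExecutorService;
   
   // third-party imports (including the shaded guava classes) go here
   
   import org.apache.hadoop.fs.s3a.S3ObjectStorageClassFilter;
   // ...alongside the rest of the org.apache imports
   ```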



##########
hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/S3AFileSystem.java:
##########
@@ -441,6 +441,8 @@ public class S3AFileSystem extends FileSystem implements StreamCapabilities,
    */
   private boolean isCSEEnabled;
 
+  private S3ObjectStorageClassFilter s3ObjectStorageClassFilter;

Review Comment:
   nit: add a javadoc, and remember a "." at the end to keep all javadoc
   versions happy.
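   Something like this, say (wording illustrative):
   
   ```java
   /**
    * Filter deciding which S3 objects are visible, based on their storage class.
    */
   private S3ObjectStorageClassFilter s3ObjectStorageClassFilter;
   ```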



##########
hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/S3ObjectStorageClassFilter.java:
##########
@@ -0,0 +1,69 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.hadoop.fs.s3a;
+
+import org.apache.hadoop.thirdparty.com.google.common.collect.Sets;
+import java.util.Set;
+import java.util.function.Function;
+import software.amazon.awssdk.services.s3.model.ObjectStorageClass;
+import software.amazon.awssdk.services.s3.model.S3Object;
+
+
+/**
+ * <pre>
+ * {@link S3ObjectStorageClassFilter} will filter the S3 files based on the
+ * {@code fs.s3a.glacier.read.restored.objects} configuration set in {@link S3AFileSystem}.
+ * The config can have 3 values:
+ * {@code READ_ALL}: Retrieval of Glacier files will fail with
+ * InvalidObjectStateException: The operation is not valid for the object's storage class.
+ * {@code SKIP_ALL_GLACIER}: Ignore any S3 objects tagged with a Glacier storage
+ * class and retrieve the others.
+ * {@code READ_RESTORED_GLACIER_OBJECTS}: Check the restored status of each Glacier
+ * object; restored objects are read like normal S3 objects, the rest are ignored as
+ * they have not yet been retrieved from S3 Glacier.
+ * </pre>
+ */
+public enum S3ObjectStorageClassFilter {
+  READ_ALL(o -> true),
+  SKIP_ALL_GLACIER(S3ObjectStorageClassFilter::isNotGlacierObject),
+  READ_RESTORED_GLACIER_OBJECTS(S3ObjectStorageClassFilter::isCompletedRestoredObject);
+
+  private static final Set<ObjectStorageClass> GLACIER_STORAGE_CLASSES =
+      Sets.newHashSet(ObjectStorageClass.GLACIER, ObjectStorageClass.DEEP_ARCHIVE);
+
+  private final Function<S3Object, Boolean> filter;
+
+  S3ObjectStorageClassFilter(Function<S3Object, Boolean> filter) {
+    this.filter = filter;
+  }
+
+  private static boolean isNotGlacierObject(S3Object object) {

Review Comment:
   add javadocs all the way down here, thanks
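   For example (the method body is a guess from the surrounding constants,
   and the javadoc wording is illustrative):
   
   ```java
   /**
    * Checks if the storage class of the object is not a Glacier storage class.
    *
    * @param object the S3 object to check.
    * @return true if the object is not stored in GLACIER or DEEP_ARCHIVE.
    */
   private static boolean isNotGlacierObject(S3Object object) {
     return !GLACIER_STORAGE_CLASSES.contains(object.storageClass());
   }
   ```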



##########
hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/S3AFileSystem.java:
##########
@@ -581,6 +583,12 @@ public void initialize(URI name, Configuration originalConf)
 
       s3aInternals = createS3AInternals();
 
+      s3ObjectStorageClassFilter = Optional.ofNullable(conf.get(READ_RESTORED_GLACIER_OBJECTS))

Review Comment:
   @ahmarsuhail but doing it the way it is does handle case differences.
   
   I'd go for getTrimmed(READ_RESTORED_GLACIER_OBJECTS, ""); if the result is
   an empty string, map to an empty optional, otherwise toUpperCase() and
   valueOf(). One thing to consider: a meaningful failure if the value doesn't
   map.
   
   I'd change Configuration to do that case mapping if it wasn't such a
   critical class.
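   A minimal sketch of that (field and key names as in the PR; the exact
   exception text and the Locale.ROOT upper-casing are my assumptions):
   
   ```java
   String filter = conf.getTrimmed(READ_RESTORED_GLACIER_OBJECTS, "");
   if (filter.isEmpty()) {
     s3ObjectStorageClassFilter = Optional.empty();
   } else {
     try {
       s3ObjectStorageClassFilter = Optional.of(
           S3ObjectStorageClassFilter.valueOf(filter.toUpperCase(Locale.ROOT)));
     } catch (IllegalArgumentException e) {
       // fail meaningfully when the value maps to no enum constant
       throw new IllegalArgumentException("Invalid value of "
           + READ_RESTORED_GLACIER_OBJECTS + ": " + filter, e);
     }
   }
   ```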



##########
hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/S3AFileSystem.java:
##########
@@ -581,6 +583,12 @@ public void initialize(URI name, Configuration originalConf)
 
       s3aInternals = createS3AInternals();
 
+      s3ObjectStorageClassFilter = Optional.ofNullable(conf.get(READ_RESTORED_GLACIER_OBJECTS))

Review Comment:
   or we just go for "upper case is required" and use what you've proposed.
   more brittle but simpler?



##########
hadoop-tools/hadoop-aws/src/test/java/org/apache/hadoop/fs/s3a/list/ITestS3AReadRestoredGlacierObjects.java:
##########
@@ -0,0 +1,198 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.hadoop.fs.s3a.list;
+
+import static org.apache.hadoop.fs.s3a.Constants.READ_RESTORED_GLACIER_OBJECTS;
+import static org.apache.hadoop.fs.s3a.Constants.STORAGE_CLASS;
+import static org.apache.hadoop.fs.s3a.Constants.STORAGE_CLASS_DEEP_ARCHIVE;
+import static org.apache.hadoop.fs.s3a.Constants.STORAGE_CLASS_GLACIER;
+import static org.apache.hadoop.fs.s3a.S3ATestUtils.disableFilesystemCaching;
+import static org.apache.hadoop.fs.s3a.S3ATestUtils.removeBaseAndBucketOverrides;
+import static org.apache.hadoop.fs.s3a.S3ATestUtils.skipIfStorageClassTestsDisabled;
+import static org.apache.hadoop.fs.s3a.audit.S3AAuditConstants.REJECT_OUT_OF_SPAN_OPERATIONS;
+
+import java.util.Arrays;
+import java.util.Collection;
+import org.apache.hadoop.conf.Configuration;
+import org.apache.hadoop.fs.FileSystem;
+import org.apache.hadoop.fs.Path;
+import org.apache.hadoop.fs.contract.ContractTestUtils;
+import org.apache.hadoop.fs.contract.s3a.S3AContract;
+import org.apache.hadoop.fs.s3a.AbstractS3ATestBase;
+import org.apache.hadoop.fs.s3a.S3ListRequest;
+import org.apache.hadoop.fs.s3a.S3ObjectStorageClassFilter;
+import org.assertj.core.api.Assertions;
+import org.junit.Assume;
+import org.junit.Test;
+import org.junit.runner.RunWith;
+import org.junit.runners.Parameterized;
+import software.amazon.awssdk.services.s3.S3Client;
+import software.amazon.awssdk.services.s3.model.GlacierJobParameters;
+import software.amazon.awssdk.services.s3.model.ObjectStorageClass;
+import software.amazon.awssdk.services.s3.model.RestoreObjectRequest;
+import software.amazon.awssdk.services.s3.model.RestoreRequest;
+import software.amazon.awssdk.services.s3.model.S3Object;
+import software.amazon.awssdk.services.s3.model.Tier;
+
+@RunWith(Parameterized.class)
+public class ITestS3AReadRestoredGlacierObjects extends AbstractS3ATestBase {
+
+  enum Type { GLACIER_AND_DEEP_ARCHIVE, GLACIER }
+
+  @Parameterized.Parameters
+  public static Collection<Object[]> data(){
+    return Arrays.asList(new Object[][] {
+        {Type.GLACIER_AND_DEEP_ARCHIVE, STORAGE_CLASS_GLACIER},
+        {Type.GLACIER_AND_DEEP_ARCHIVE, STORAGE_CLASS_DEEP_ARCHIVE},
+        {Type.GLACIER, STORAGE_CLASS_GLACIER}
+    });
+  }
+
+  private int retryCount = 0;
+  private final int MAX_RETRIES = 100;
+  private final int RETRY_DELAY_MS = 5000;
+
+  private Type type;
+  private String glacierClass;
+
+  public ITestS3AReadRestoredGlacierObjects(Type type, String glacierClass) {
+    this.type = type;
+    this.glacierClass = glacierClass;
+  }
+
+  private FileSystem createFiles(String s3ObjectStorageClassFilter) throws Throwable {
+    Configuration conf = this.createConfiguration();
+    conf.set(READ_RESTORED_GLACIER_OBJECTS, s3ObjectStorageClassFilter);
+    conf.set(STORAGE_CLASS, glacierClass); // Create Glacier objects: storage class DEEP_ARCHIVE or GLACIER
+    S3AContract contract = (S3AContract) createContract(conf);
+    contract.init();
+
+    FileSystem fs = contract.getTestFileSystem();
+    Path dir = methodPath();
+    fs.mkdirs(dir);
+    Path path = new Path(dir, "file1");
+    ContractTestUtils.touch(fs, path);
+    return fs;
+  }
+
+  @Override
+  protected Configuration createConfiguration() {
+    Configuration newConf = super.createConfiguration();
+    skipIfStorageClassTestsDisabled(newConf);
+    disableFilesystemCaching(newConf);
+    removeBaseAndBucketOverrides(newConf, STORAGE_CLASS);
+    newConf.set(REJECT_OUT_OF_SPAN_OPERATIONS, "false");
+    return newConf;
+  }
+
+  @Test
+  public void testIgnoreGlacierObject() throws Throwable {
+    Assume.assumeTrue(type == Type.GLACIER_AND_DEEP_ARCHIVE);
+    try (FileSystem fs = createFiles(S3ObjectStorageClassFilter.SKIP_ALL_GLACIER.name())) {
+      Assertions.assertThat(
+          fs.listStatus(methodPath()))
+        .describedAs("FileStatus List of %s", methodPath()).isEmpty();
+    }
+  }
+
+  @Test
+  public void testIgnoreRestoringGlacierObject() throws Throwable {
+    Assume.assumeTrue(type == Type.GLACIER_AND_DEEP_ARCHIVE);
+    try (FileSystem fs = createFiles(S3ObjectStorageClassFilter.READ_RESTORED_GLACIER_OBJECTS.name())) {
+      Assertions.assertThat(
+              fs.listStatus(
+                  methodPath()))
+          .describedAs("FileStatus List of %s", methodPath()).isEmpty();
+    }
+  }
+
+  @Test
+  public void testRestoredGlacierObject() throws Throwable {
+    // Skipping this test for Deep Archive as expedited retrieval is not supported
+    Assume.assumeTrue(type == Type.GLACIER);
+    try (FileSystem fs = createFiles(S3ObjectStorageClassFilter.READ_RESTORED_GLACIER_OBJECTS.name())) {
+      restoreGlacierObject(methodPath().toUri().getHost(), getFilePrefixForListObjects() + "file1", 2);
+      Assertions.assertThat(
+              fs.listStatus(
+                  methodPath()))
+          .describedAs("FileStatus List of %s", methodPath()).isNotEmpty();
+    }
+  }
+
+  @Test
+  public void testDefault() throws Throwable {
+    Assume.assumeTrue(type == Type.GLACIER_AND_DEEP_ARCHIVE);
+    try (FileSystem fs = createFiles(S3ObjectStorageClassFilter.READ_ALL.name())) {
+      Assertions.assertThat(
+              fs.listStatus(methodPath()))
+          .describedAs("FileStatus List of %s", methodPath()).isNotEmpty();
+    }
+  }
+
+
+  private void restoreGlacierObject(String bucketName, String glacierObjectKey, int expirationDays) {
+
+    S3Client s3Client = getFileSystem().getS3AInternals().getAmazonS3Client("test");
+
+    // Create a restore object request
+    RestoreObjectRequest requestRestore = RestoreObjectRequest.builder()
+        .bucket(bucketName)
+        .key(glacierObjectKey)
+        .restoreRequest(
+            RestoreRequest.builder().glacierJobParameters(
+                GlacierJobParameters.builder()
+                    .tier(Tier.EXPEDITED)
+                    .build()).days(expirationDays)
+                .build())
+        .build();
+
+    s3Client.restoreObject(requestRestore);
+
+    // fetch the glacier object
+    S3ListRequest s3ListRequest = getFileSystem().createListObjectsRequest(
+        getFilePrefixForListObjects(), "/");
+    S3Object s3GlacierObject = getS3GlacierObject(s3Client, s3ListRequest);
+
+    while ((s3GlacierObject != null && s3GlacierObject.restoreStatus().isRestoreInProgress())
+        && retryCount < MAX_RETRIES) {
+      // Wait for few seconds before checking again
+      try {
+        Thread.sleep(RETRY_DELAY_MS);
+        retryCount++;
+      } catch (Exception e) {
+        throw new RuntimeException(e);
+      }
+      s3GlacierObject = getS3GlacierObject(s3Client, s3ListRequest);
+    }
+
+    if (retryCount >= MAX_RETRIES) {
+      throw new RuntimeException("The restore process exceeded the maximum allowed time.");
+    }
+  }
+
+
+  private String getFilePrefixForListObjects() {
+    return getContract().getTestPath().getName() + "/" + methodName.getMethodName() + "/";

Review Comment:
   use methodPath() and go from there... the s3a fs can map from paths to keys.
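   A sketch of that, using the public S3AFileSystem helpers (pathToKey() and
   getBucket() exist on S3AFileSystem; wiring them in here is my suggestion):
   
   ```java
   // derive bucket and object key from the method path instead of string-building
   String bucket = getFileSystem().getBucket();
   String key = getFileSystem().pathToKey(methodPath());
   restoreGlacierObject(bucket, key + "/file1", 2);
   ```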



##########
hadoop-tools/hadoop-aws/src/test/java/org/apache/hadoop/fs/s3a/list/ITestS3AReadRestoredGlacierObjects.java:
##########
@@ -0,0 +1,198 @@
+@RunWith(Parameterized.class)
+public class ITestS3AReadRestoredGlacierObjects extends AbstractS3ATestBase {

Review Comment:
   again, javadoc here



##########
hadoop-tools/hadoop-aws/src/test/java/org/apache/hadoop/fs/s3a/list/ITestS3AReadRestoredGlacierObjects.java:
##########
@@ -0,0 +1,198 @@
+    S3Object s3GlacierObject = getS3GlacierObject(s3Client, s3ListRequest);
+
+    while ((s3GlacierObject != null && s3GlacierObject.restoreStatus().isRestoreInProgress())
+        && retryCount < MAX_RETRIES) {

Review Comment:
   `LambdaTestUtils.await()` is designed to handle this.
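   A minimal sketch of that, reusing the test's own timing constants
   (LambdaTestUtils.await(timeoutMillis, intervalMillis, check) from
   org.apache.hadoop.test polls until the check returns true):
   
   ```java
   // poll until the restore completes (or the object is gone), failing on timeout
   LambdaTestUtils.await(MAX_RETRIES * RETRY_DELAY_MS, RETRY_DELAY_MS,
       () -> {
         S3Object o = getS3GlacierObject(s3Client, s3ListRequest);
         return o == null || !o.restoreStatus().isRestoreInProgress();
       });
   ```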



##########
hadoop-tools/hadoop-aws/src/test/java/org/apache/hadoop/fs/s3a/list/ITestS3AReadRestoredGlacierObjects.java:
##########
@@ -0,0 +1,198 @@
+  private void restoreGlacierObject(String bucketName, String glacierObjectKey, int expirationDays) {
+
+    S3Client s3Client = getFileSystem().getS3AInternals().getAmazonS3Client("test");
+
+    // Create a restore object request
+    RestoreObjectRequest requestRestore = RestoreObjectRequest.builder()

Review Comment:
   prefer this was in the RequestFactory interface and builder, as it'll let us
   do things like add audit context and anything else in future.
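   A hypothetical sketch of what that could look like; RequestFactory has no
   restore-request method today, so newRestoreObjectRequestBuilder() is an
   invented name following the existing new*RequestBuilder pattern:
   
   ```java
   // hypothetical RequestFactory method, mirroring newGetObjectRequestBuilder()
   RestoreObjectRequest requestRestore = getFileSystem().getRequestFactory()
       .newRestoreObjectRequestBuilder(glacierObjectKey)
       .restoreRequest(RestoreRequest.builder()
           .glacierJobParameters(
               GlacierJobParameters.builder().tier(Tier.EXPEDITED).build())
           .days(expirationDays)
           .build())
       .build();
   ```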



##########
hadoop-tools/hadoop-aws/src/test/java/org/apache/hadoop/fs/s3a/list/ITestS3AReadRestoredGlacierObjects.java:
##########
@@ -0,0 +1,198 @@
+@RunWith(Parameterized.class)
+public class ITestS3AReadRestoredGlacierObjects extends AbstractS3ATestBase {
+
+  enum Type { GLACIER_AND_DEEP_ARCHIVE, GLACIER }
+
+  @Parameterized.Parameters

Review Comment:
   * look at other uses of this to see how we generate useful strings for logs
   * be aware the pattern is used in the method path, so it mustn't create
   invalid paths. meaningful text is so much better than [0]
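   For instance (the pattern itself is my suggestion; "{0}" and "{1}" expand
   to the test type and storage class strings, which are already path-safe):
   
   ```java
   @Parameterized.Parameters(name = "{0}-{1}")
   public static Collection<Object[]> data() {
     return Arrays.asList(new Object[][] {
         {Type.GLACIER_AND_DEEP_ARCHIVE, STORAGE_CLASS_GLACIER},
         {Type.GLACIER_AND_DEEP_ARCHIVE, STORAGE_CLASS_DEEP_ARCHIVE},
         {Type.GLACIER, STORAGE_CLASS_GLACIER}
     });
   }
   ```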



##########
hadoop-tools/hadoop-aws/src/test/java/org/apache/hadoop/fs/s3a/list/ITestS3AReadRestoredGlacierObjects.java:
##########
@@ -0,0 +1,198 @@
+  @Override
+  protected Configuration createConfiguration() {
+    Configuration newConf = super.createConfiguration();
+    skipIfStorageClassTestsDisabled(newConf);
+    disableFilesystemCaching(newConf);
+    removeBaseAndBucketOverrides(newConf, STORAGE_CLASS);
+    newConf.set(REJECT_OUT_OF_SPAN_OPERATIONS, "false");

Review Comment:
   or better: create an audit span
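   i.e. rather than disabling out-of-span rejection, wrap the raw client call
   in a span. A sketch, assuming S3AFileSystem.createSpan() and an object key
   derived as above:
   
   ```java
   // run the out-of-band restore call inside an audit span
   try (AuditSpan span = getFileSystem().createSpan("restoreObject", key, null)) {
     s3Client.restoreObject(requestRestore);
   }
   ```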



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

