steveloughran commented on code in PR #6617: URL: https://github.com/apache/hadoop/pull/6617#discussion_r1522047711
########## hadoop-tools/hadoop-azure/src/test/java/org/apache/hadoop/fs/azurebfs/services/ITestAbfsInputStreamReadFooter.java: ########## @@ -54,9 +63,44 @@ public class ITestAbfsInputStreamReadFooter extends ITestAbfsInputStream { private static final int TEN = 10; private static final int TWENTY = 20; + private static ExecutorService executorService; + + private static final int SIZE_256_KB = 256 * ONE_KB; + + private static final Integer[] FILE_SIZES = { Review Comment: This is going to make a slower test on remote runs. Does it really have to be this big or is it possible to tune things so that they work with smaller files? Because if this is the restriction then it is going to have to become a scale test, which will not be run as often. ########## hadoop-tools/hadoop-azure/src/test/java/org/apache/hadoop/fs/azurebfs/services/ITestAbfsInputStreamReadFooter.java: ########## @@ -322,28 +434,52 @@ private void testPartialReadWithNoData(final FileSystem fs, @Test public void testPartialReadWithSomeData() throws Exception { - for (int i = 0; i <= 4; i++) { - for (int j = 0; j <= 2; j++) { - int fileSize = (int) Math.pow(2, i) * 256 * ONE_KB; - int footerReadBufferSize = (int) Math.pow(2, j) * 256 * ONE_KB; - final AzureBlobFileSystem fs = getFileSystem(true, - fileSize, footerReadBufferSize); - String fileName = methodName.getMethodName() + i; - byte[] fileContent = getRandomBytesArray(fileSize); - Path testFilePath = createFileWithContent(fs, fileName, fileContent); - testPartialReadWithSomeData(fs, testFilePath, - fileSize - AbfsInputStream.FOOTER_SIZE, AbfsInputStream.FOOTER_SIZE, - fileContent, footerReadBufferSize); + int fileIdx = 0; + List<Future> futureList = new ArrayList<>(); + for (int fileSize : FILE_SIZES) { + final int fileId = fileIdx++; + futureList.add(executorService.submit(() -> { + try (AzureBlobFileSystem spiedFs = createSpiedFs( + getRawConfiguration())) { + String fileName = methodName.getMethodName() + fileId; + byte[] fileContent = 
getRandomBytesArray(fileSize); + Path testFilePath = createFileWithContent(spiedFs, fileName, + fileContent); + testParialReadWithSomeData(spiedFs, fileSize, testFilePath, + fileContent); + } catch (Exception ex) { + throw new RuntimeException(ex); + } + })); + } + for (Future future : futureList) { + future.get(); + } + } + + private void testParialReadWithSomeData(final AzureBlobFileSystem spiedFs, Review Comment: nit: typo ########## hadoop-tools/hadoop-azure/src/test/java/org/apache/hadoop/fs/azurebfs/services/ITestAbfsInputStreamReadFooter.java: ########## @@ -71,22 +115,40 @@ public void testMultipleServerCallsAreMadeWhenTheConfIsFalse() private void testNumBackendCalls(boolean optimizeFooterRead) throws Exception { int fileIdx = 0; - for (int i = 0; i <= 4; i++) { - for (int j = 0; j <= 2; j++) { - int fileSize = (int) Math.pow(2, i) * 256 * ONE_KB; - int footerReadBufferSize = (int) Math.pow(2, j) * 256 * ONE_KB; - final AzureBlobFileSystem fs = getFileSystem( - optimizeFooterRead, fileSize); - Path testFilePath = createPathAndFileWithContent( - fs, fileIdx++, fileSize); + final List<Future> futureList = new ArrayList<>(); + for (int fileSize : FILE_SIZES) { + final int fileId = fileIdx++; + Future future = executorService.submit(() -> { + try (AzureBlobFileSystem spiedFs = createSpiedFs( + getRawConfiguration())) { + Path testPath = createPathAndFileWithContent( + spiedFs, fileId, fileSize); + testNumBackendCalls(spiedFs, optimizeFooterRead, fileSize, + testPath); + } catch (Exception ex) { + throw new RuntimeException(ex); + } + }); + futureList.add(future); + } + for (Future future : futureList) { Review Comment: I'm going to suggest that in org.apache.hadoop.util.functional.FutureIO you add a new awaitFutures(Collection) method, which iterates through the collection and calls awaitFuture on each. And yes, you should be passing down a timeout, as when JUnit times out it is less informative. 
########## hadoop-tools/hadoop-azure/src/test/java/org/apache/hadoop/fs/azurebfs/services/ITestAbfsInputStreamReadFooter.java: ########## @@ -322,28 +434,52 @@ private void testPartialReadWithNoData(final FileSystem fs, @Test public void testPartialReadWithSomeData() throws Exception { - for (int i = 0; i <= 4; i++) { - for (int j = 0; j <= 2; j++) { - int fileSize = (int) Math.pow(2, i) * 256 * ONE_KB; - int footerReadBufferSize = (int) Math.pow(2, j) * 256 * ONE_KB; - final AzureBlobFileSystem fs = getFileSystem(true, - fileSize, footerReadBufferSize); - String fileName = methodName.getMethodName() + i; - byte[] fileContent = getRandomBytesArray(fileSize); - Path testFilePath = createFileWithContent(fs, fileName, fileContent); - testPartialReadWithSomeData(fs, testFilePath, - fileSize - AbfsInputStream.FOOTER_SIZE, AbfsInputStream.FOOTER_SIZE, - fileContent, footerReadBufferSize); + int fileIdx = 0; + List<Future> futureList = new ArrayList<>(); + for (int fileSize : FILE_SIZES) { + final int fileId = fileIdx++; + futureList.add(executorService.submit(() -> { + try (AzureBlobFileSystem spiedFs = createSpiedFs( + getRawConfiguration())) { + String fileName = methodName.getMethodName() + fileId; + byte[] fileContent = getRandomBytesArray(fileSize); + Path testFilePath = createFileWithContent(spiedFs, fileName, + fileContent); + testParialReadWithSomeData(spiedFs, fileSize, testFilePath, + fileContent); + } catch (Exception ex) { + throw new RuntimeException(ex); + } + })); + } + for (Future future : futureList) { + future.get(); + } + } + + private void testParialReadWithSomeData(final AzureBlobFileSystem spiedFs, + final int fileSize, final Path testFilePath, final byte[] fileContent) + throws IOException { + for (int readBufferSize : READ_BUFFER_SIZE) { + for (int footerReadBufferSize : FOOTER_READ_BUFFER_SIZE) { + changeFooterConfigs(spiedFs, true, + fileSize, footerReadBufferSize, readBufferSize); + + testPartialReadWithSomeData(spiedFs, testFilePath, + 
fileSize - AbfsInputStream.FOOTER_SIZE, + AbfsInputStream.FOOTER_SIZE, + fileContent, footerReadBufferSize, readBufferSize); } } } private void testPartialReadWithSomeData(final FileSystem fs, Review Comment: if this is a JUnit entry point, add @Test and leave the name alone. Otherwise, to avoid confusion, give it a different name such as validatePartialReadWithSomeData -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org --------------------------------------------------------------------- To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org