[GitHub] [hadoop] sumangala-patki commented on a change in pull request #2698: HADOOP-17527. ABFS: Fix boundary conditions in InputStream seek and skip
sumangala-patki commented on a change in pull request #2698:
URL: https://github.com/apache/hadoop/pull/2698#discussion_r588169126

## File path: hadoop-tools/hadoop-azure/src/main/java/org/apache/hadoop/fs/azurebfs/services/AbfsInputStream.java ##
@@ -542,7 +542,7 @@ public synchronized void seek(long n) throws IOException {
     if (n < 0) {
       throw new EOFException(FSExceptionMessages.NEGATIVE_SEEK);
     }
-    if (n > contentLength) {
+    if (n > 0 && n >= contentLength) {
       throw new EOFException(FSExceptionMessages.CANNOT_SEEK_PAST_EOF);

Review comment:
seek(0) is always allowed. The n > 0 guard is needed so that seek(0) on a 0-byte file, where contentLength is also 0, does not throw an exception.

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at: us...@infra.apache.org
-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org
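The patched condition in the seek() hunk can be tried out in isolation. The following is a minimal sketch, not the real AbfsInputStream: the class SeekBoundsSketch, its two methods, and the exception message strings are hypothetical stand-ins, and only the two if-conditions are taken from the diff.

```java
import java.io.EOFException;

/** Hypothetical stand-alone model of the seek() bounds check; not the real AbfsInputStream. */
final class SeekBoundsSketch {
  private SeekBoundsSketch() {
  }

  /** Returns n if the seek target is acceptable, otherwise throws EOFException. */
  static long checkSeek(long n, long contentLength) throws EOFException {
    if (n < 0) {
      // Placeholder text; the patch uses FSExceptionMessages.NEGATIVE_SEEK.
      throw new EOFException("negative seek");
    }
    // Same condition as the "+" line of the diff: the n > 0 guard keeps
    // seek(0) legal on a 0-byte file, where contentLength == 0.
    if (n > 0 && n >= contentLength) {
      // Placeholder text; the patch uses FSExceptionMessages.CANNOT_SEEK_PAST_EOF.
      throw new EOFException("seek past EOF");
    }
    return n;
  }

  /** Convenience wrapper (an assumption for illustration): true when checkSeek would not throw. */
  static boolean seekAllowed(long n, long contentLength) {
    try {
      checkSeek(n, contentLength);
      return true;
    } catch (EOFException e) {
      return false;
    }
  }
}
```

Under this model, seekAllowed(0, 0) is true, seekAllowed(4, 5) is true, and seekAllowed(5, 5) is false, so the last valid seek target in a non-empty file is contentLength - 1.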
sumangala-patki commented on a change in pull request #2698:
URL: https://github.com/apache/hadoop/pull/2698#discussion_r582563511

## File path: hadoop-tools/hadoop-azure/src/test/java/org/apache/hadoop/fs/azurebfs/ITestAbfsInputStreamStatistics.java ##
@@ -100,28 +100,31 @@ public void testSeekStatistics() throws IOException {
     AbfsOutputStream out = null;
     AbfsInputStream in = null;
+    int readBufferSize = getConfiguration().getReadBufferSize();

Review comment:
Modified to use the original buffer and a reduced readBufferSize.
sumangala-patki commented on a change in pull request #2698:
URL: https://github.com/apache/hadoop/pull/2698#discussion_r582563181

## File path: hadoop-tools/hadoop-azure/src/test/java/org/apache/hadoop/fs/azurebfs/ITestAzureBlobFileSystemRandomRead.java ##
@@ -402,6 +400,18 @@ public void testSkipAndAvailableAndPosition() throws Exception {
           inputStream.getPos());
       assertEquals(testFileLength - inputStream.getPos(), inputStream.available());
+
+      skipped = inputStream.skip(testFileLength + 1); //goes to last byte
+      assertEquals(1, inputStream.available());
+      bytesRead = inputStream.read(buffer);
+      assertEquals(1, bytesRead);
+      assertEquals(testFileLength, inputStream.getPos());

Review comment:
getPos() returns contentLength after a read that reaches (and includes) EOF. However, a seek or skip to an invalid position is not supported, so getPos() after either of those operations always returns a valid position.
sumangala-patki commented on a change in pull request #2698:
URL: https://github.com/apache/hadoop/pull/2698#discussion_r581200495

## File path: hadoop-tools/hadoop-azure/src/test/java/org/apache/hadoop/fs/azurebfs/ITestAzureBlobFileSystemRandomRead.java ##
@@ -405,6 +418,27 @@ public void testSkipAndAvailableAndPosition() throws Exception {
     }
   }

+  @Test
+  public void testZeroByteFile() throws Exception {
+    Path emptyFile = new Path("/emptyFile");
+    getFileSystem().create(emptyFile);
+    FSDataInputStream in = getFileSystem().open(emptyFile);
+    assertEquals("Initial position of inputstream in empty file is 0", 0,
+        in.getPos());
+    in.seek(0);
+    assertEquals("Seek to 0 should succeed", 0, in.getPos());
+    in.skip(0);

Review comment:
Added assertion.

## File path: hadoop-tools/hadoop-azure/src/test/java/org/apache/hadoop/fs/azurebfs/ITestAzureBlobFileSystemRandomRead.java ##
@@ -203,23 +203,36 @@ public void testSkipBounds() throws Exception {
       assertTrue(testFileLength > 0);

-      skipped = inputStream.skip(testFileLength);
-      assertEquals(testFileLength, skipped);
+      //test skip to EOF with correct input skip count
+      assertEquals("Position should be 0", 0, inputStream.getPos());
+      inputStream.skip(testFileLength - 1);
+      assertEquals("Position should be EOF", testFileLength - 1,
+          inputStream.getPos());

-      intercept(EOFException.class,
-          new Callable() {
-            @Override
-            public Long call() throws Exception {
-              return inputStream.skip(1);
-            }
-          }
-      );
       long elapsedTimeMs = timer.elapsedTimeMs();
       assertTrue(
-          String.format(
-              "There should not be any network I/O (elapsedTimeMs=%1$d).",
-              elapsedTimeMs),
-          elapsedTimeMs < MAX_ELAPSEDTIMEMS);
+      String.format(
+          "There should not be any network I/O (elapsedTimeMs=%1$d).",
+          elapsedTimeMs),
+      elapsedTimeMs < MAX_ELAPSEDTIMEMS);
+
+      //test negative skip from last valid position
+      skipped = inputStream.skip(-testFileLength+1);

Review comment:
Done.
sumangala-patki commented on a change in pull request #2698:
URL: https://github.com/apache/hadoop/pull/2698#discussion_r581199945

## File path: hadoop-tools/hadoop-azure/src/test/java/org/apache/hadoop/fs/azurebfs/ITestAbfsInputStreamStatistics.java ##
@@ -100,28 +100,31 @@ public void testSeekStatistics() throws IOException {
     AbfsOutputStream out = null;
     AbfsInputStream in = null;
+    int readBufferSize = getConfiguration().getReadBufferSize();
+    byte[] buf = new byte[readBufferSize + 1];

Review comment:
The extra byte allows a seek to position readBufferSize.

Example: with readBufferSize = 4, the buffer size is 5, so 5 bytes are written to the file and the valid indices in the file are [0 1 2 3 4].
The read call reads readBufferSize bytes, i.e. 4 bytes (up to and including position 3 in the file), so fCursor is now at position 4.
To be counted as a forward seek, any subsequent seek has to target a position >= the current fCursor (4).
Hence the file must have at least 4 + 1 = 5 bytes for seek(4) to be allowed.
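The arithmetic in the comment above can be checked with a small model. This is only a sketch under stated assumptions: the class ForwardSeekSketch and both of its methods are hypothetical, the cursor is modelled as a plain long, and the in-bounds rule is the "n > 0 && n >= contentLength" check from the seek() change in this pull request.

```java
/** Hypothetical model of the forward-seek arithmetic; not the ABFS implementation. */
final class ForwardSeekSketch {
  private ForwardSeekSketch() {
  }

  /** Cursor position after reading up to 'requested' bytes from 'cursor'. */
  static long cursorAfterRead(long cursor, int requested, long fileLength) {
    return Math.min(fileLength, cursor + requested);
  }

  /**
   * True when 'target' is a legal seek position (0, or strictly inside the file)
   * and does not lie behind the current cursor.
   */
  static boolean isForwardSeek(long target, long cursor, long fileLength) {
    boolean inBounds = target == 0 || (target > 0 && target < fileLength);
    return inBounds && target >= cursor;
  }
}
```

With readBufferSize = 4 and a 5-byte file, cursorAfterRead(0, 4, 5) gives 4 and isForwardSeek(4, 4, 5) holds. With only 4 bytes in the file, the same seek(4) falls outside the valid range, which is why the test buffer is one byte larger than readBufferSize.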
sumangala-patki commented on a change in pull request #2698:
URL: https://github.com/apache/hadoop/pull/2698#discussion_r578609371

## File path: hadoop-tools/hadoop-azure/src/main/java/org/apache/hadoop/fs/azurebfs/services/AbfsInputStream.java ##
@@ -583,8 +583,8 @@ public synchronized long skip(long n) throws IOException {
       newPos = 0;
       n = newPos - currentPos;
     }
-    if (newPos > contentLength) {
-      newPos = contentLength;
+    if (newPos >= contentLength) {

Review comment:
skip(0) should seek to 0; corrected and added a test.
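The hunk above is cut off before the new assignment, so the exact clamp is not visible here. The sketch below assumes that an overshooting skip lands on the last valid byte (contentLength - 1), which is what the "//goes to last byte" comment in testSkipAndAvailableAndPosition suggests, and that an empty file keeps position 0. The class SkipClampSketch and its method are hypothetical and do not reproduce AbfsInputStream.skip().

```java
/** Hypothetical model of how skip(n) could clamp its target position. */
final class SkipClampSketch {
  private SkipClampSketch() {
  }

  /** Returns the position that skip(n) from currentPos would land on. */
  static long clampedSkipTarget(long currentPos, long n, long contentLength) {
    long newPos = currentPos + n;
    if (newPos < 0) {
      // A negative skip cannot move before the start of the file.
      newPos = 0;
    }
    if (newPos >= contentLength) {
      // Assumption: clamp to the last valid byte; for a 0-byte file stay at 0.
      newPos = Math.max(0, contentLength - 1);
    }
    return newPos;
  }
}
```

For a 10-byte file, clampedSkipTarget(0, 11, 10) yields 9, leaving one byte available, and clampedSkipTarget(9, -9, 10) returns to 0. For an empty file, clampedSkipTarget(0, 0, 0) stays at 0.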
sumangala-patki commented on a change in pull request #2698:
URL: https://github.com/apache/hadoop/pull/2698#discussion_r578608579

## File path: hadoop-tools/hadoop-azure/src/test/java/org/apache/hadoop/fs/azurebfs/ITestAzureBlobFileSystemRandomRead.java ##
@@ -402,6 +399,18 @@ public void testSkipAndAvailableAndPosition() throws Exception {
           inputStream.getPos());
       assertEquals(testFileLength - inputStream.getPos(), inputStream.available());
+
+      skipped = inputStream.skip(testFileLength + 1); //goes to last byte
+      assertEquals(1, inputStream.available());

Review comment:
Added.