[GitHub] [hadoop] sumangala-patki commented on a change in pull request #2698: HADOOP-17527. ABFS: Fix boundary conditions in InputStream seek and skip

2021-03-05 Thread GitBox


sumangala-patki commented on a change in pull request #2698:
URL: https://github.com/apache/hadoop/pull/2698#discussion_r588169126



##
File path: 
hadoop-tools/hadoop-azure/src/main/java/org/apache/hadoop/fs/azurebfs/services/AbfsInputStream.java
##
@@ -542,7 +542,7 @@ public synchronized void seek(long n) throws IOException {
 if (n < 0) {
   throw new EOFException(FSExceptionMessages.NEGATIVE_SEEK);
 }
-if (n > contentLength) {
+if (n > 0 && n >= contentLength) {
   throw new EOFException(FSExceptionMessages.CANNOT_SEEK_PAST_EOF);

Review comment:
   seek(0) must always be allowed. The n > 0 guard is needed so that seek(0) on 
a 0-byte file, where n >= contentLength also holds, does not throw an exception.
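
A minimal sketch of the boundary check described above, pulled out into a standalone helper so the edge cases can be exercised without a file system. The class name, method names, and exception messages are hypothetical placeholders and are not part of AbfsInputStream; only the two conditions come from the diff.

```java
import java.io.EOFException;

// Hypothetical helper, not part of AbfsInputStream: it only mirrors the
// two conditions shown in the diff above.
final class SeekBoundsSketch {
  private SeekBoundsSketch() {
  }

  /** Throws EOFException if n is not a seekable offset in a file of contentLength bytes. */
  static void validateSeek(long n, long contentLength) throws EOFException {
    if (n < 0) {
      throw new EOFException("negative seek"); // placeholder message
    }
    // The n > 0 guard keeps seek(0) legal even when contentLength == 0.
    if (n > 0 && n >= contentLength) {
      throw new EOFException("seek past EOF"); // placeholder message
    }
  }

  /** Convenience wrapper that reports validity as a boolean. */
  static boolean isSeekable(long n, long contentLength) {
    try {
      validateSeek(n, contentLength);
      return true;
    } catch (EOFException e) {
      return false;
    }
  }

  public static void main(String[] args) {
    System.out.println(isSeekable(0, 0)); // seek(0) in an empty file
    System.out.println(isSeekable(5, 5)); // seek to contentLength
    System.out.println(isSeekable(4, 5)); // seek to the last byte
  }
}
```

Without the n > 0 guard, isSeekable(0, 0) would be false, because 0 >= 0 satisfies the second condition.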





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[GitHub] [hadoop] sumangala-patki commented on a change in pull request #2698: HADOOP-17527. ABFS: Fix boundary conditions in InputStream seek and skip

2021-02-24 Thread GitBox


sumangala-patki commented on a change in pull request #2698:
URL: https://github.com/apache/hadoop/pull/2698#discussion_r582563511



##
File path: 
hadoop-tools/hadoop-azure/src/test/java/org/apache/hadoop/fs/azurebfs/ITestAbfsInputStreamStatistics.java
##
@@ -100,28 +100,31 @@ public void testSeekStatistics() throws IOException {
 AbfsOutputStream out = null;
 AbfsInputStream in = null;
 
+int readBufferSize = getConfiguration().getReadBufferSize();

Review comment:
   Modified to use the original buffer and a reduced readBufferSize.








[GitHub] [hadoop] sumangala-patki commented on a change in pull request #2698: HADOOP-17527. ABFS: Fix boundary conditions in InputStream seek and skip

2021-02-24 Thread GitBox


sumangala-patki commented on a change in pull request #2698:
URL: https://github.com/apache/hadoop/pull/2698#discussion_r582563181



##
File path: 
hadoop-tools/hadoop-azure/src/test/java/org/apache/hadoop/fs/azurebfs/ITestAzureBlobFileSystemRandomRead.java
##
@@ -402,6 +400,18 @@ public void testSkipAndAvailableAndPosition() throws Exception {
   inputStream.getPos());
   assertEquals(testFileLength - inputStream.getPos(),
   inputStream.available());
+
+  skipped = inputStream.skip(testFileLength + 1); //goes to last byte
+  assertEquals(1, inputStream.available());
+  bytesRead = inputStream.read(buffer);
+  assertEquals(1, bytesRead);
+  assertEquals(testFileLength, inputStream.getPos());

Review comment:
   After a read that reaches EOF, getPos() returns contentLength. Seeking or 
skipping to an invalid position is not supported, however, so getPos() after a 
seek or skip always returns a valid position.
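
The position semantics in this comment can be illustrated with a toy in-memory model. This is a sketch only and not the ABFS implementation: the class and its fields are hypothetical, and the clamp in skip() is an assumption inferred from the "goes to last byte" comment in the test above.

```java
// Toy in-memory model (hypothetical): shows that a read may leave the
// position at contentLength, while a skip is assumed to stop at the last byte.
final class PositionModelSketch {
  private final byte[] data;
  private long pos;

  PositionModelSketch(byte[] data) {
    this.data = data;
  }

  long getPos() {
    return pos;
  }

  int available() {
    return (int) (data.length - pos);
  }

  /** Reads up to buffer.length bytes; afterwards pos may equal data.length. */
  int read(byte[] buffer) {
    if (pos >= data.length) {
      return -1;
    }
    int n = (int) Math.min(buffer.length, data.length - pos);
    System.arraycopy(data, (int) pos, buffer, 0, n);
    pos += n;
    return n;
  }

  /** Assumed clamp: never moves before 0 or past the last byte (0 for an empty file). */
  long skip(long n) {
    long lastValid = Math.max(0, data.length - 1);
    long newPos = Math.max(0, Math.min(pos + n, lastValid));
    long skipped = newPos - pos;
    pos = newPos;
    return skipped;
  }

  public static void main(String[] args) {
    PositionModelSketch in = new PositionModelSketch(new byte[10]);
    in.skip(10 + 1);                       // clamped to the last byte
    System.out.println(in.available());    // one byte left to read
    System.out.println(in.read(new byte[4]));
    System.out.println(in.getPos());       // equals the file length after the read
  }
}
```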








[GitHub] [hadoop] sumangala-patki commented on a change in pull request #2698: HADOOP-17527. ABFS: Fix boundary conditions in InputStream seek and skip

2021-02-23 Thread GitBox


sumangala-patki commented on a change in pull request #2698:
URL: https://github.com/apache/hadoop/pull/2698#discussion_r581200495



##
File path: 
hadoop-tools/hadoop-azure/src/test/java/org/apache/hadoop/fs/azurebfs/ITestAzureBlobFileSystemRandomRead.java
##
@@ -405,6 +418,27 @@ public void testSkipAndAvailableAndPosition() throws Exception {
 }
   }
 
+  @Test
+  public void testZeroByteFile() throws Exception {
+Path emptyFile = new Path("/emptyFile");
+getFileSystem().create(emptyFile);
+FSDataInputStream in = getFileSystem().open(emptyFile);
+assertEquals("Initial position of inputstream in empty file is 0", 0,
+in.getPos());
+in.seek(0);
+assertEquals("Seek to 0 should succeed", 0, in.getPos());
+in.skip(0);

Review comment:
   added assertion

##
File path: 
hadoop-tools/hadoop-azure/src/test/java/org/apache/hadoop/fs/azurebfs/ITestAzureBlobFileSystemRandomRead.java
##
@@ -203,23 +203,36 @@ public void testSkipBounds() throws Exception {
 
   assertTrue(testFileLength > 0);
 
-  skipped = inputStream.skip(testFileLength);
-  assertEquals(testFileLength, skipped);
+  //test skip to EOF with correct input skip count
+  assertEquals("Position should be 0", 0, inputStream.getPos());
+  inputStream.skip(testFileLength - 1);
+  assertEquals("Position should be EOF", testFileLength - 1,
+  inputStream.getPos());
 
-  intercept(EOFException.class,
-  new Callable() {
-@Override
-public Long call() throws Exception {
-  return inputStream.skip(1);
-}
-  }
-  );
   long elapsedTimeMs = timer.elapsedTimeMs();
   assertTrue(
-  String.format(
-  "There should not be any network I/O (elapsedTimeMs=%1$d).",
-  elapsedTimeMs),
-  elapsedTimeMs < MAX_ELAPSEDTIMEMS);
+  String.format(
+  "There should not be any network I/O (elapsedTimeMs=%1$d).",
+  elapsedTimeMs),
+  elapsedTimeMs < MAX_ELAPSEDTIMEMS);
+
+  //test negative skip from last valid position
+  skipped = inputStream.skip(-testFileLength+1);

Review comment:
   done








[GitHub] [hadoop] sumangala-patki commented on a change in pull request #2698: HADOOP-17527. ABFS: Fix boundary conditions in InputStream seek and skip

2021-02-23 Thread GitBox


sumangala-patki commented on a change in pull request #2698:
URL: https://github.com/apache/hadoop/pull/2698#discussion_r581199945



##
File path: 
hadoop-tools/hadoop-azure/src/test/java/org/apache/hadoop/fs/azurebfs/ITestAbfsInputStreamStatistics.java
##
@@ -100,28 +100,31 @@ public void testSeekStatistics() throws IOException {
 AbfsOutputStream out = null;
 AbfsInputStream in = null;
 
+int readBufferSize = getConfiguration().getReadBufferSize();
+byte[] buf = new byte[readBufferSize + 1];

Review comment:
   The extra byte allows a seek to position readBufferSize.
   Example: if readBufferSize = 4, the buffer size is 5, so 5 bytes are written 
to the file and the valid indices in the file are [0 1 2 3 4].
   The read call reads readBufferSize bytes, i.e. 4 bytes (up to and including 
position 3 in the file), so fcursor is now at position 4.
   To count as a forward seek, any subsequent seek must target a position 
>= the current fcursor (4). Hence the file needs at least 4 + 1 = 5 bytes 
for seek(4) to be allowed.
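
The sizing argument above can be checked with a small worked example. All names here are hypothetical and the helpers only restate the rules given in the comment: a seek target must index an existing byte (with seek(0) always allowed), and a seek counts as forward when the target is at or beyond the current cursor.

```java
// Worked example of the buffer-sizing argument; names are hypothetical.
final class ForwardSeekSizingSketch {
  private ForwardSeekSizingSketch() {
  }

  /** Cursor after one read of readBufferSize bytes from position 0 (file assumed large enough). */
  static long cursorAfterFirstRead(int readBufferSize) {
    return readBufferSize;
  }

  /** A target is valid if it is 0 or indexes an existing byte. */
  static boolean isValidSeekTarget(long target, long fileLength) {
    return target == 0 || (target > 0 && target < fileLength);
  }

  /** A seek counts as forward when the target is at or beyond the cursor. */
  static boolean isForwardSeek(long target, long cursor) {
    return target >= cursor;
  }

  public static void main(String[] args) {
    int readBufferSize = 4;
    long cursor = cursorAfterFirstRead(readBufferSize);

    // With readBufferSize + 1 bytes in the file, seek(4) is valid and forward.
    long fileLength = readBufferSize + 1;
    System.out.println(isValidSeekTarget(cursor, fileLength) && isForwardSeek(cursor, cursor));

    // With only readBufferSize bytes, position 4 does not exist.
    System.out.println(isValidSeekTarget(cursor, readBufferSize));
  }
}
```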








[GitHub] [hadoop] sumangala-patki commented on a change in pull request #2698: HADOOP-17527. ABFS: Fix boundary conditions in InputStream seek and skip

2021-02-18 Thread GitBox


sumangala-patki commented on a change in pull request #2698:
URL: https://github.com/apache/hadoop/pull/2698#discussion_r578609371



##
File path: 
hadoop-tools/hadoop-azure/src/main/java/org/apache/hadoop/fs/azurebfs/services/AbfsInputStream.java
##
@@ -583,8 +583,8 @@ public synchronized long skip(long n) throws IOException {
   newPos = 0;
   n = newPos - currentPos;
 }
-if (newPos > contentLength) {
-  newPos = contentLength;
+if (newPos >= contentLength) {

Review comment:
   skip(0) should seek to 0; corrected and added a test.








[GitHub] [hadoop] sumangala-patki commented on a change in pull request #2698: HADOOP-17527. ABFS: Fix boundary conditions in InputStream seek and skip

2021-02-18 Thread GitBox


sumangala-patki commented on a change in pull request #2698:
URL: https://github.com/apache/hadoop/pull/2698#discussion_r578608579



##
File path: 
hadoop-tools/hadoop-azure/src/test/java/org/apache/hadoop/fs/azurebfs/ITestAzureBlobFileSystemRandomRead.java
##
@@ -402,6 +399,18 @@ public void testSkipAndAvailableAndPosition() throws Exception {
   inputStream.getPos());
   assertEquals(testFileLength - inputStream.getPos(),
   inputStream.available());
+
+  skipped = inputStream.skip(testFileLength + 1); //goes to last byte
+  assertEquals(1, inputStream.available());

Review comment:
   added




