Github user ConeyLiu commented on a diff in the pull request:

    https://github.com/apache/spark/pull/20026#discussion_r158243377
  
    --- Diff: core/src/main/java/org/apache/spark/io/NioBufferedFileInputStream.java ---
    @@ -61,6 +61,7 @@ private boolean refill() throws IOException {
             nRead = fileChannel.read(byteBuffer);
           }
           if (nRead < 0) {
    +        byteBuffer.flip();
    --- End diff --
    
    This is related to this error: 
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/85228/testReport/org.apache.spark.storage/BlockManagerSuite/LRU_with_mixed_storage_levels_and_streams__encryption___on_/
    I'm not sure of the exact cause, but I guess it happens with a loop like the following:
    ```scala
    var i = 0
    while (i < inputStream.available()) {
      // do something, e.g. call inputStream.read()
      i += 1
    }
    ```
    When we reach the end of the file, where `i == (inputStream.available() 
- 1)`, the next `inputStream.read()` returns `-1`, and that call goes through 
`refill()` as well. Even though no data can be read from the underlying 
`fileChannel`, the state of the `byteBuffer` has already been changed, so the 
value of `inputStream.available()` changes and we can still read the stale 
data remaining in the `byteBuffer`.
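
To make the guess above concrete, here is a rough standalone sketch using only `java.nio.ByteBuffer`. It is not the actual `NioBufferedFileInputStream` code: the class name `StaleBufferDemo` and the method `remainingAfterEof` are made up for illustration, the file read is replaced by a hard-coded `-1`, and it assumes that `refill()` calls `byteBuffer.clear()` before reading, which is not visible in the diff context.

```java
import java.nio.ByteBuffer;

public class StaleBufferDemo {
    // Hypothetical helper: simulates one refill attempt at end of file on a
    // fully consumed buffer and returns what available() would then report.
    // flipOnEof = true mirrors the one-line change in the diff.
    public static int remainingAfterEof(boolean flipOnEof) {
        ByteBuffer buf = ByteBuffer.allocate(8);
        buf.put(new byte[] {1, 2, 3, 4, 5, 6, 7, 8}); // an earlier successful read
        buf.flip();
        buf.position(buf.limit());                    // the caller consumed everything

        if (!buf.hasRemaining()) {
            // Assumption: the buffer is cleared before the read attempt.
            // clear() sets position = 0 and limit = capacity; old bytes stay in place.
            buf.clear();
            int nRead = -1; // stand-in for fileChannel.read(buf) at end of file
            if (nRead < 0 && flipOnEof) {
                buf.flip(); // limit = position = 0, so nothing is left to read
            }
        }
        return buf.remaining();
    }

    public static void main(String[] args) {
        System.out.println("without flip: " + remainingAfterEof(false)); // prints 8
        System.out.println("with flip:    " + remainingAfterEof(true));  // prints 0
    }
}
```

Under these assumptions, the buffer reports 8 stale bytes as readable when it is not flipped on end of file, and 0 bytes once the `flip()` is added.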


---
