[GitHub] [hadoop] dannycjones commented on a diff in pull request #4205: HADOOP-18177. Document prefetching architecture.

2022-04-26 Thread GitBox


dannycjones commented on code in PR #4205:
URL: https://github.com/apache/hadoop/pull/4205#discussion_r858597582


##
hadoop-tools/hadoop-aws/src/site/markdown/tools/hadoop-aws/prefetching.md:
##
@@ -0,0 +1,192 @@
+
+
+# S3A Prefetching
+
+This document explains the `S3PrefetchingInputStream` and the various 
components it uses.
+
+This input stream implements prefetching and caching to improve read 
performance of the input
+stream.
+A high level overview of this feature was published in
+[Pinterest Engineering's blog post titled "Improving efficiency and reducing 
runtime using S3 read 
optimization"](https://medium.com/pinterest-engineering/improving-efficiency-and-reducing-runtime-using-s3-read-optimization-b31da4b60fa0).
+
+With prefetching, the input stream divides the remote file into blocks of a 
fixed size, associates
+buffers to these blocks and then reads data into these buffers asynchronously. 
+It also potentially caches these blocks.
+
+### Basic Concepts
+
+* **Remote File**: A binary blob of data stored on some storage device.
+* **Block File**: Local file containing a block of the remote file.
+* **Block**: A file is divided into a number of blocks. 
+The size of the first n-1 blocks is the same, and the last block may 
be the same size or smaller.
+* **Block based reading**: The granularity of read is one block. 
+That is, either an entire block is read and returned or none at all. 
+Multiple blocks may be read in parallel.
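The block layout described above can be sketched as follows (a hypothetical helper for illustration, not part of the Hadoop codebase):

```java
// Illustrative sketch of the block layout described above; not Hadoop code.
public class BlockLayout {
    // Number of blocks needed to cover a file of the given size.
    static long blockCount(long fileSize, long blockSize) {
        return (fileSize + blockSize - 1) / blockSize;
    }

    // Size of block i: the first n-1 blocks are full-sized,
    // the last block may be smaller.
    static long blockLength(long fileSize, long blockSize, long i) {
        long remaining = fileSize - i * blockSize;
        return Math.min(blockSize, remaining);
    }

    public static void main(String[] args) {
        long mb = 1024L * 1024L;
        // A 20 MB file with 8 MB blocks has three blocks: 8 MB, 8 MB, 4 MB.
        System.out.println(blockCount(20 * mb, 8 * mb));     // 3
        System.out.println(blockLength(20 * mb, 8 * mb, 2)); // 4194304 (4 MB)
    }
}
```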
+
+### Configuring the stream
+
+|Property|Meaning|Default|
+|---   |---|---|
+|`fs.s3a.prefetch.enabled`|Enable the prefetch input stream|`true` |
+|`fs.s3a.prefetch.block.size`|Size of a block|`8M`|
+|`fs.s3a.prefetch.block.count`|Number of blocks to prefetch|`8`|
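For example, these properties can be set in `core-site.xml` in the usual Hadoop way (a sketch; the values shown here are just the defaults from the table above):

```xml
<property>
  <name>fs.s3a.prefetch.enabled</name>
  <value>true</value>
</property>
<property>
  <name>fs.s3a.prefetch.block.size</name>
  <value>8M</value>
</property>
<property>
  <name>fs.s3a.prefetch.block.count</name>
  <value>8</value>
</property>
```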
+
+### Key Components
+
+`S3PrefetchingInputStream` - When prefetching is enabled, `S3AFileSystem` will 
return an instance of
+this class as the input stream. 
+Depending on the remote file size, it will either use
+the `S3InMemoryInputStream` or the `S3CachingInputStream` as the underlying 
input stream.
+
+`S3InMemoryInputStream` - Underlying input stream used when the remote file 
size < configured block
+size. 
+Will read the entire remote file into memory.
+
+`S3CachingInputStream` - Underlying input stream used when remote file size > 
configured block size.
+Uses asynchronous prefetching of blocks and caching to improve performance.
+
+`BlockData` - Holds information about the blocks in a remote file, such as:
+
+* Number of blocks in the remote file
+* Block size
+* State of each block (initially all blocks have state *NOT_READY*). 
+Other states are: *QUEUED*, *READY*, *CACHED*.
+
+`BufferData` - Holds the buffer and additional information about it such as:
+
+* The block number this buffer is for
+* State of the buffer (*Unknown*, *Blank*, *Prefetching*, *Caching*, *Ready*, *Done*). 
+The initial state of a buffer is *Blank*.
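The buffer lifecycle implied above can be sketched as a small state machine. This is a hypothetical reconstruction for illustration only; the transitions shown are plausible guesses, and the real `BufferData` class differs in detail:

```java
import java.util.EnumSet;
import java.util.Map;

// Hypothetical sketch of the buffer lifecycle described above;
// not the actual BufferData implementation.
public class BufferState {
    enum State { UNKNOWN, BLANK, PREFETCHING, CACHING, READY, DONE }

    // Assumed transitions: a blank buffer is filled by a prefetch, becomes
    // ready, may then be written to the cache, and is finally released.
    static final Map<State, EnumSet<State>> NEXT = Map.of(
        State.UNKNOWN, EnumSet.noneOf(State.class),
        State.BLANK, EnumSet.of(State.PREFETCHING, State.READY),
        State.PREFETCHING, EnumSet.of(State.READY),
        State.READY, EnumSet.of(State.CACHING, State.DONE),
        State.CACHING, EnumSet.of(State.READY, State.DONE),
        State.DONE, EnumSet.noneOf(State.class));

    static boolean canTransition(State from, State to) {
        return NEXT.getOrDefault(from, EnumSet.noneOf(State.class)).contains(to);
    }

    public static void main(String[] args) {
        System.out.println(canTransition(State.BLANK, State.PREFETCHING)); // true
        System.out.println(canTransition(State.DONE, State.READY));        // false
    }
}
```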
+
+`CachingBlockManager` - Implements reading data into the buffer, prefetching 
and caching.
+
+`BufferPool` - Manages a fixed-size pool of buffers. 
+It’s used by `CachingBlockManager` to acquire buffers.
+
+`S3File` - Implements operations to interact with S3 such as opening and 
closing the input stream to
+the remote file in S3.
+
+`S3Reader` - Implements reading from the stream opened by `S3File`. 
+Reads from this input stream in blocks of 64KB.
+
+`FilePosition` - Provides functionality related to tracking the position in 
the file. 
+Also gives access to the current buffer in use.
+
+`SingleFilePerBlockCache` - Responsible for caching blocks to the local file 
system. 
+Each cache block is stored on the local disk as a separate block file.
+
+### Operation
+
+#### S3InMemoryInputStream
+
+For a remote file of size 5MB and a block size of 8MB, the file size is less 
than the block size,
+so the `S3InMemoryInputStream` will be used.
+
+If the caller makes the following read calls:
+
+```
+in.read(buffer, 0, 3MB);
+in.read(buffer, 0, 2MB);
+```
+
+When the first read is issued, there is no buffer in use yet. 
+The `S3InMemoryInputStream` gets the data in this remote file by calling the 
`ensureCurrentBuffer()` 
+method, which ensures that a buffer with data is available to be read from.
+
+The `ensureCurrentBuffer()` then:
+
+* Reads data into a buffer by calling `S3Reader.read(ByteBuffer buffer, long 
offset, int size)`.
+* `S3Reader` uses `S3File` to open an input stream to the remote file in S3 by 
making
+  a `getObject()` request with range as `(0, filesize)`.
+* The `S3Reader` reads the entire remote file into the provided buffer, and 
once reading is complete
+  closes the S3 stream and frees all underlying resources.
+* Now that the entire remote file is in a buffer, this data is set in 
`FilePosition` so it can be accessed
+  by the input stream.
+
+The read operation now just gets the required bytes from the buffer in 
`FilePosition`.
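At a small scale, serving reads from a fully buffered file looks like the following toy sketch (this is an illustration of the idea, not Hadoop's actual implementation; the class and byte values are invented):

```java
import java.nio.ByteBuffer;

// Toy sketch of serving reads from a fully buffered remote file, as
// S3InMemoryInputStream does once the file is in memory. Not Hadoop code.
public class InMemoryRead {
    private final ByteBuffer data; // entire remote file, already fetched

    InMemoryRead(byte[] fileContents) {
        this.data = ByteBuffer.wrap(fileContents);
    }

    // Copy up to len bytes into dest starting at off; returns the number of
    // bytes read, or -1 at end of file, mirroring InputStream.read semantics.
    int read(byte[] dest, int off, int len) {
        if (!data.hasRemaining()) {
            return -1;
        }
        int n = Math.min(len, data.remaining());
        data.get(dest, off, n);
        return n;
    }

    public static void main(String[] args) {
        // 5-byte "file", reads of 3 then 4 bytes (like the 5MB/3MB/2MB example).
        InMemoryRead in = new InMemoryRead(new byte[] {1, 2, 3, 4, 5});
        byte[] buf = new byte[8];
        System.out.println(in.read(buf, 0, 3)); // 3
        System.out.println(in.read(buf, 0, 4)); // 2: only two bytes remain
        System.out.println(in.read(buf, 0, 1)); // -1: end of file
    }
}
```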
+
+When the second read is issued, there is already a valid buffer which can be 
used. 
+Do

[GitHub] [hadoop] dannycjones commented on a diff in pull request #4205: HADOOP-18177. Document prefetching architecture.

2022-04-26 Thread GitBox


dannycjones commented on code in PR #4205:
URL: https://github.com/apache/hadoop/pull/4205#discussion_r858455675


##
hadoop-tools/hadoop-aws/src/site/markdown/tools/hadoop-aws/prefetching.md:
##
@@ -14,94 +14,114 @@
 
 # S3A Prefetching
 
-
 This document explains the `S3PrefetchingInputStream` and the various 
components it uses.
 
-This input stream implements prefetching and caching to improve read 
performance of the input stream. A high level overview of this feature can also 
be found on 
[this](https://medium.com/pinterest-engineering/improving-efficiency-and-reducing-runtime-using-s3-read-optimization-b31da4b60fa0)
 blogpost.
+This input stream implements prefetching and caching to improve read 
performance of the input
+stream. A high level overview of this feature was published in
+[Pinterest Engineering's blog post titled "Improving efficiency and reducing 
runtime using S3 read 
optimization"](https://medium.com/pinterest-engineering/improving-efficiency-and-reducing-runtime-using-s3-read-optimization-b31da4b60fa0)
+blogpost.
 
-With prefetching, we divide the file into blocks of a fixed size (default is 
8MB), associate buffers to these blocks, and then read data into these buffers 
asynchronously. We also potentially cache these blocks.
+With prefetching, the input stream divides the remote file into blocks of a 
fixed size, associates
+buffers to these blocks and then reads data into these buffers asynchronously. 
It also potentially
+caches these blocks.
 
 ### Basic Concepts
 
-* **File** : A binary blob of data stored on some storage device.
-* **Block :** A file is divided into a number of blocks. The default size of a 
block is 8MB, but can be configured. The size of the first n-1 blocks is same,  
and the size of the last block may be same or smaller.
-* **Block based reading** : The granularity of read is one block. That is, we 
read an entire block and return or none at all. Multiple blocks may be read in 
parallel.
+* **Remote File**: A binary blob of data stored on some storage device.
+* **Block**: A file is divided into a number of blocks. The size of the first 
n-1 blocks is same,
+  and the size of the last block may be same or smaller.
+* **Block based reading**: The granularity of read is one block. That is, 
either an entire block is
+  read and returned or none at all. Multiple blocks may be read in parallel.
 
 ### Configuring the stream
 
 |Property|Meaning|Default|
 |---   |---|---|
-|fs.s3a.prefetch.enabled|Enable the prefetch input stream|TRUE |
-|fs.s3a.prefetch.block.size|Size of a block|8MB|
-|fs.s3a.prefetch.block.count|Number of blocks to prefetch|8|
+|fs.s3a.prefetch.enabled|Enable the prefetch input stream|`true` |
+|fs.s3a.prefetch.block.size|Size of a block|`8M`|
+|fs.s3a.prefetch.block.count|Number of blocks to prefetch|`8`|

Review Comment:
   We want backticks around the configuration keys too.
   
   ```suggestion
   |`fs.s3a.prefetch.enabled`|Enable the prefetch input stream|`true` |
   |`fs.s3a.prefetch.block.size`|Size of a block|`8M`|
   |`fs.s3a.prefetch.block.count`|Number of blocks to prefetch|`8`|
   ```



##
hadoop-tools/hadoop-aws/src/site/markdown/tools/hadoop-aws/prefetching.md:
##
@@ -14,94 +14,114 @@
 
 # S3A Prefetching
 
-
 This document explains the `S3PrefetchingInputStream` and the various 
components it uses.
 
-This input stream implements prefetching and caching to improve read 
performance of the input stream. A high level overview of this feature can also 
be found on 
[this](https://medium.com/pinterest-engineering/improving-efficiency-and-reducing-runtime-using-s3-read-optimization-b31da4b60fa0)
 blogpost.
+This input stream implements prefetching and caching to improve read 
performance of the input
+stream. A high level overview of this feature was published in
+[Pinterest Engineering's blog post titled "Improving efficiency and reducing 
runtime using S3 read 
optimization"](https://medium.com/pinterest-engineering/improving-efficiency-and-reducing-runtime-using-s3-read-optimization-b31da4b60fa0)
+blogpost.

Review Comment:
   drop `blogpost` as it is already mentioned earlier



##
hadoop-tools/hadoop-aws/src/site/markdown/tools/hadoop-aws/prefetching.md:
##
@@ -14,94 +14,114 @@
 
 # S3A Prefetching
 
-
 This document explains the `S3PrefetchingInputStream` and the various 
components it uses.
 
-This input stream implements prefetching and caching to improve read 
performance of the input stream. A high level overview of this feature can also 
be found on 
[this](https://medium.com/pinterest-engineering/improving-efficiency-and-reducing-runtime-using-s3-read-optimization-b31da4b60fa0)
 blogpost.
+This input stream implements prefetching and caching to improve read 
performance of the input
+stream. A high level overview of this feature was published in
+[Pinter

[GitHub] [hadoop] dannycjones commented on a diff in pull request #4205: HADOOP-18177. Document prefetching architecture.

2022-04-21 Thread GitBox


dannycjones commented on code in PR #4205:
URL: https://github.com/apache/hadoop/pull/4205#discussion_r854934280


##
hadoop-tools/hadoop-aws/src/site/markdown/tools/hadoop-aws/prefetching.md:
##
@@ -0,0 +1,151 @@
+
+
+# S3A Prefetching
+
+
+This document explains the `S3PrefetchingInputStream` and the various 
components it uses.
+
+This input stream implements prefetching and caching to improve read 
performance of the input stream. A high level overview of this feature can also 
be found on 
[this](https://medium.com/pinterest-engineering/improving-efficiency-and-reducing-runtime-using-s3-read-optimization-b31da4b60fa0)
 blogpost.

Review Comment:
   Let's put the title in the link somehow for screenreaders.
   
   ```suggestion
   This input stream implements prefetching and caching to improve read 
performance of the input stream. A high level overview of this feature was 
published in [Pinterest Engineering's blog post titled "Improving efficiency 
and reducing runtime using S3 read 
optimization"](https://medium.com/pinterest-engineering/improving-efficiency-and-reducing-runtime-using-s3-read-optimization-b31da4b60fa0).
   ```



##
hadoop-tools/hadoop-aws/src/site/markdown/tools/hadoop-aws/prefetching.md:
##
@@ -0,0 +1,151 @@
+
+
+# S3A Prefetching
+
+
+This document explains the `S3PrefetchingInputStream` and the various 
components it uses.
+
+This input stream implements prefetching and caching to improve read 
performance of the input stream. A high level overview of this feature can also 
be found on 
[this](https://medium.com/pinterest-engineering/improving-efficiency-and-reducing-runtime-using-s3-read-optimization-b31da4b60fa0)
 blogpost.
+
+With prefetching, we divide the file into blocks of a fixed size (default is 
8MB), associate buffers to these blocks, and then read data into these buffers 
asynchronously. We also potentially cache these blocks.
+
+### Basic Concepts
+
+* **File** : A binary blob of data stored on some storage device.
+* **Block :** A file is divided into a number of blocks. The default size of a 
block is 8MB, but can be configured. The size of the first n-1 blocks is same,  
and the size of the last block may be same or smaller.
+* **Block based reading** : The granularity of read is one block. That is, we 
read an entire block and return or none at all. Multiple blocks may be read in 
parallel.

Review Comment:
   colon is inside the bold here but not for others
   
   ```suggestion
   * **File**: A binary blob of data stored on some storage device.
   * **Block**: A file is divided into a number of blocks. The default size of 
a block is 8MB, but can be configured. The size of the first n-1 blocks is 
same,  and the size of the last block may be same or smaller.
   * **Block based reading**: The granularity of read is one block. That is, we 
read an entire block and return or none at all. Multiple blocks may be read in 
parallel.
   ```



##
hadoop-tools/hadoop-aws/src/site/markdown/tools/hadoop-aws/prefetching.md:
##
@@ -0,0 +1,151 @@
+
+
+# S3A Prefetching
+
+
+This document explains the `S3PrefetchingInputStream` and the various 
components it uses.
+
+This input stream implements prefetching and caching to improve read 
performance of the input stream. A high level overview of this feature can also 
be found on 
[this](https://medium.com/pinterest-engineering/improving-efficiency-and-reducing-runtime-using-s3-read-optimization-b31da4b60fa0)
 blogpost.
+
+With prefetching, we divide the file into blocks of a fixed size (default is 
8MB), associate buffers to these blocks, and then read data into these buffers 
asynchronously. We also potentially cache these blocks.
+
+### Basic Concepts
+
+* **File** : A binary blob of data stored on some storage device.
+* **Block :** A file is divided into a number of blocks. The default size of a 
block is 8MB, but can be configured. The size of the first n-1 blocks is same,  
and the size of the last block may be same or smaller.
+* **Block based reading** : The granularity of read is one block. That is, we 
read an entire block and return or none at all. Multiple blocks may be read in 
parallel.
+
+### Configuring the stream
+
+|Property|Meaning|Default|
+|---   |---|---|
+|fs.s3a.prefetch.enabled|Enable the prefetch input stream|TRUE |
+|fs.s3a.prefetch.block.size|Size of a block|8MB|
+|fs.s3a.prefetch.block.count|Number of blocks to prefetch|8|
+
+### Key Components:
+
+`S3PrefetchingInputStream` - When prefetching is enabled, S3AFileSystem will 
return an instance of this class as the input stream. Depending on the file 
size, it will either use the `S3InMemoryInputStream` or the 
`S3CachingInputStream` as the underlying input stream.
+
+`S3InMemoryInputStream` - Underlying input stream used when the file size < 
configured block size. Will read the entire file into memory.
+
+`S3CachingInputStream` - Underlying input s