[GitHub] [hadoop] steveloughran commented on a diff in pull request #4205: HADOOP-18177. Document prefetching architecture.

GitBox Fri, 22 Apr 2022 09:58:13 -0700


steveloughran commented on code in PR #4205:
URL: https://github.com/apache/hadoop/pull/4205#discussion_r856383998



##########
hadoop-tools/hadoop-aws/src/site/markdown/tools/hadoop-aws/prefetching.md:
##########
@@ -0,0 +1,151 @@
+<!---
+  Licensed under the Apache License, Version 2.0 (the "License");
+  you may not use this file except in compliance with the License.
+  You may obtain a copy of the License at
+
+   http://www.apache.org/licenses/LICENSE-2.0
+
+  Unless required by applicable law or agreed to in writing, software
+  distributed under the License is distributed on an "AS IS" BASIS,
+  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+  See the License for the specific language governing permissions and
+  limitations under the License. See accompanying LICENSE file.
+-->
+
+# S3A Prefetching
+
+
+This document explains the `S3PrefetchingInputStream` and the various 
components it uses.
+
+This input stream implements prefetching and caching to improve read 
performance of the input stream. A high level overview of this feature can also 
be found on 
[this](https://medium.com/pinterest-engineering/improving-efficiency-and-reducing-runtime-using-s3-read-optimization-b31da4b60fa0)
 blogpost.
+
+With prefetching, we divide the file into blocks of a fixed size (default is 
8MB), associate buffers to these blocks, and then read data into these buffers 
asynchronously. We also potentially cache these blocks.
+
+### Basic Concepts
+
+* **File** : A binary blob of data stored on some storage device.
+* **Block :** A file is divided into a number of blocks. The default size of a 
block is 8MB, but can be configured. The size of the first n-1 blocks is same,  
and the size of the last block may be same or smaller.
+* **Block based reading** : The granularity of read is one block. That is, we 
read an entire block and return or none at all. Multiple blocks may be read in 
parallel.
+
+### Configuring the stream
+
+|Property    |Meaning    |Default    |
+|---   |---    |---    |
+|fs.s3a.prefetch.enabled    |Enable the prefetch input stream    |TRUE |

Review Comment:
   1. use backticks around the configuration names and values; all values must 
be valid if passed in as the config strings. they will be.
   2. use `true` for true, rather than the capitalised value



##########
hadoop-tools/hadoop-aws/src/site/markdown/tools/hadoop-aws/prefetching.md:
##########
@@ -0,0 +1,151 @@
+<!---
+  Licensed under the Apache License, Version 2.0 (the "License");
+  you may not use this file except in compliance with the License.
+  You may obtain a copy of the License at
+
+   http://www.apache.org/licenses/LICENSE-2.0
+
+  Unless required by applicable law or agreed to in writing, software
+  distributed under the License is distributed on an "AS IS" BASIS,
+  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+  See the License for the specific language governing permissions and
+  limitations under the License. See accompanying LICENSE file.
+-->
+
+# S3A Prefetching
+
+
+This document explains the `S3PrefetchingInputStream` and the various 
components it uses.
+
+This input stream implements prefetching and caching to improve read 
performance of the input stream. A high level overview of this feature can also 
be found on 
[this](https://medium.com/pinterest-engineering/improving-efficiency-and-reducing-runtime-using-s3-read-optimization-b31da4b60fa0)
 blogpost.
+
+With prefetching, we divide the file into blocks of a fixed size (default is 
8MB), associate buffers to these blocks, and then read data into these buffers 
asynchronously. We also potentially cache these blocks.
+

Review Comment:
   can you replace "we" in the docs to refer instead to the class/component 
which is doing the work



##########
hadoop-tools/hadoop-aws/src/site/markdown/tools/hadoop-aws/prefetching.md:
##########
@@ -0,0 +1,151 @@
+<!---
+  Licensed under the Apache License, Version 2.0 (the "License");
+  you may not use this file except in compliance with the License.
+  You may obtain a copy of the License at
+
+   http://www.apache.org/licenses/LICENSE-2.0
+
+  Unless required by applicable law or agreed to in writing, software
+  distributed under the License is distributed on an "AS IS" BASIS,
+  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+  See the License for the specific language governing permissions and
+  limitations under the License. See accompanying LICENSE file.
+-->
+
+# S3A Prefetching
+
+
+This document explains the `S3PrefetchingInputStream` and the various 
components it uses.
+
+This input stream implements prefetching and caching to improve read 
performance of the input stream. A high level overview of this feature can also 
be found on 
[this](https://medium.com/pinterest-engineering/improving-efficiency-and-reducing-runtime-using-s3-read-optimization-b31da4b60fa0)
 blogpost.
+
+With prefetching, we divide the file into blocks of a fixed size (default is 
8MB), associate buffers to these blocks, and then read data into these buffers 
asynchronously. We also potentially cache these blocks.
+
+### Basic Concepts
+
+* **File** : A binary blob of data stored on some storage device.
+* **Block :** A file is divided into a number of blocks. The default size of a 
block is 8MB, but can be configured. The size of the first n-1 blocks is same,  
and the size of the last block may be same or smaller.
+* **Block based reading** : The granularity of read is one block. That is, we 
read an entire block and return or none at all. Multiple blocks may be read in 
parallel.
+
+### Configuring the stream
+
+|Property    |Meaning    |Default    |
+|---   |---    |---    |
+|fs.s3a.prefetch.enabled    |Enable the prefetch input stream    |TRUE |
+|fs.s3a.prefetch.block.size    |Size of a block    |8MB    |
+|fs.s3a.prefetch.block.count    |Number of blocks to prefetch    |8    |
+
+### Key Components:
+
+`S3PrefetchingInputStream` - When prefetching is enabled, S3AFileSystem will 
return an instance of this class as the input stream. Depending on the file 
size, it will either use the `S3InMemoryInputStream` or the 
`S3CachingInputStream` as the underlying input stream.
+
+`S3InMemoryInputStream` - Underlying input stream used when the file size < 
configured block size. Will read the entire file into memory.
+
+`S3CachingInputStream` - Underlying input stream used when file size > 
configured block size. Uses asynchronous prefetching of blocks and caching to 
improve performance.
+
+`BlockData` - Holds information about the blocks in a file, such as:
+
+* Number of blocks in the file
+* Block size
+* State of each block (initially all blocks have state *NOT_READY*). Other 
states are: Queued, Ready, Cached.
+
+`BufferData` - Holds the buffer and additional information about it such as:
+
+* The block number this buffer is for
+* State of the buffer (Unknown, Blank, Prefetching, Caching, Ready, Done). 
Initial state of a buffer is blank.
+
+`CachingBlockManager` - Implements reading data into the buffer, prefetching 
and caching.
+
+`BufferPool` - Manages a fixed sized pool of buffers. It’s used by 
`CachingBlockManager` to acquire buffers.
+
+`S3File` - Implements operations to interact with S3 such as opening and 
closing the input stream to the S3 file.
+
+`S3Reader` - Implements reading from the stream opened by `S3File`. Reads from 
this input stream in blocks of 64KB.
+
+`FilePosition` - Provides functionality related to tracking the position in 
the file. Also gives access to the current buffer in use.
+
+`SingleFilePerBlockCache` - Responsible for caching blocks to the local file 
system. Each cache block is stored on the local disk as a separate file.
+
+### Operation
+
+### S3InMemoryInputStream
+
+If we have a file with size 5MB, and block size = 8MB. Since file size is less 
than the block size, the `S3InMemoryInputStream` will be used.
+
+If the caller makes the following read calls:
+
+
+```
+in.read(buffer, 0, 3MB);
+in.read(buffer, 0, 2MB);
+```
+
+When the first read is issued, there is no buffer in use yet. We get the data 
in this file by calling the `ensureCurrentBuffer()` method, which ensures that 
a buffer with data is available to be read from.
+
+The `ensureCurrentBuffer()` then:
+
+* Reads data into a buffer by calling `S3Reader.read(ByteBuffer buffer, long 
offset, int size)`
+*  `S3Reader` uses `S3File` to open an input stream to the S3 file by making a 
`getObject()` request with range as `(0, filesize)`.
+*  The S3Reader reads the entire file into the provided buffer, and once 
reading is complete closes the S3 stream and frees all underlying resources.
+* Now the entire file is in a buffer, set this data in `FilePosition` so it 
can be accessed by the input stream.
+
+The read operation now just gets the required bytes from the buffer in 
`FilePosition`.
+
+When the second read is issued, there is already a valid buffer which can be 
used. Don’t do anything else, just read the required bytes from this buffer.
+
+### S3CachingInputStream
+
+
+
+[Image: image.png]

Review Comment:
   they go into hadoop-tools/hadoop-aws/src/site/resources/images ; other 
modules have examples of this



##########
hadoop-tools/hadoop-aws/src/site/markdown/tools/hadoop-aws/prefetching.md:
##########
@@ -0,0 +1,151 @@
+<!---
+  Licensed under the Apache License, Version 2.0 (the "License");
+  you may not use this file except in compliance with the License.
+  You may obtain a copy of the License at
+
+   http://www.apache.org/licenses/LICENSE-2.0
+
+  Unless required by applicable law or agreed to in writing, software
+  distributed under the License is distributed on an "AS IS" BASIS,
+  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+  See the License for the specific language governing permissions and
+  limitations under the License. See accompanying LICENSE file.
+-->
+
+# S3A Prefetching
+
+
+This document explains the `S3PrefetchingInputStream` and the various 
components it uses.
+
+This input stream implements prefetching and caching to improve read 
performance of the input stream. A high level overview of this feature can also 
be found on 
[this](https://medium.com/pinterest-engineering/improving-efficiency-and-reducing-runtime-using-s3-read-optimization-b31da4b60fa0)
 blogpost.
+
+With prefetching, we divide the file into blocks of a fixed size (default is 
8MB), associate buffers to these blocks, and then read data into these buffers 
asynchronously. We also potentially cache these blocks.
+
+### Basic Concepts
+
+* **File** : A binary blob of data stored on some storage device.
+* **Block :** A file is divided into a number of blocks. The default size of a 
block is 8MB, but can be configured. The size of the first n-1 blocks is same,  
and the size of the last block may be same or smaller.
+* **Block based reading** : The granularity of read is one block. That is, we 
read an entire block and return or none at all. Multiple blocks may be read in 
parallel.
+
+### Configuring the stream
+
+|Property    |Meaning    |Default    |
+|---   |---    |---    |
+|fs.s3a.prefetch.enabled    |Enable the prefetch input stream    |TRUE |
+|fs.s3a.prefetch.block.size    |Size of a block    |8MB    |
+|fs.s3a.prefetch.block.count    |Number of blocks to prefetch    |8    |
+
+### Key Components:
+
+`S3PrefetchingInputStream` - When prefetching is enabled, S3AFileSystem will 
return an instance of this class as the input stream. Depending on the file 
size, it will either use the `S3InMemoryInputStream` or the 
`S3CachingInputStream` as the underlying input stream.
+
+`S3InMemoryInputStream` - Underlying input stream used when the file size < 
configured block size. Will read the entire file into memory.
+
+`S3CachingInputStream` - Underlying input stream used when file size > 
configured block size. Uses asynchronous prefetching of blocks and caching to 
improve performance.
+
+`BlockData` - Holds information about the blocks in a file, such as:
+
+* Number of blocks in the file
+* Block size
+* State of each block (initially all blocks have state *NOT_READY*). Other 
states are: Queued, Ready, Cached.
+
+`BufferData` - Holds the buffer and additional information about it such as:
+
+* The block number this buffer is for
+* State of the buffer (Unknown, Blank, Prefetching, Caching, Ready, Done). 
Initial state of a buffer is blank.
+
+`CachingBlockManager` - Implements reading data into the buffer, prefetching 
and caching.
+
+`BufferPool` - Manages a fixed sized pool of buffers. It’s used by 
`CachingBlockManager` to acquire buffers.
+
+`S3File` - Implements operations to interact with S3 such as opening and 
closing the input stream to the S3 file.
+
+`S3Reader` - Implements reading from the stream opened by `S3File`. Reads from 
this input stream in blocks of 64KB.
+
+`FilePosition` - Provides functionality related to tracking the position in 
the file. Also gives access to the current buffer in use.
+
+`SingleFilePerBlockCache` - Responsible for caching blocks to the local file 
system. Each cache block is stored on the local disk as a separate file.
+
+### Operation
+
+### S3InMemoryInputStream
+
+If we have a file with size 5MB, and block size = 8MB. Since file size is less 
than the block size, the `S3InMemoryInputStream` will be used.
+
+If the caller makes the following read calls:
+
+
+```
+in.read(buffer, 0, 3MB);
+in.read(buffer, 0, 2MB);
+```
+
+When the first read is issued, there is no buffer in use yet. We get the data 
in this file by calling the `ensureCurrentBuffer()` method, which ensures that 
a buffer with data is available to be read from.
+
+The `ensureCurrentBuffer()` then:
+
+* Reads data into a buffer by calling `S3Reader.read(ByteBuffer buffer, long 
offset, int size)`
+*  `S3Reader` uses `S3File` to open an input stream to the S3 file by making a 
`getObject()` request with range as `(0, filesize)`.
+*  The S3Reader reads the entire file into the provided buffer, and once 
reading is complete closes the S3 stream and frees all underlying resources.
+* Now the entire file is in a buffer, set this data in `FilePosition` so it 
can be accessed by the input stream.
+
+The read operation now just gets the required bytes from the buffer in 
`FilePosition`.
+
+When the second read is issued, there is already a valid buffer which can be 
used. Don’t do anything else, just read the required bytes from this buffer.
+
+### S3CachingInputStream
+
+
+
+[Image: image.png]
+
+Now, if we have a file with size 40MB, and block size = 8MB. The 
`S3CachingInputStream` will be used.
+
+### Sequential Reads
+
+If the caller makes the following calls:
+
+```
+in.read(buffer, 0, 5MB)
+in.read(buffer, 0, 8MB)
+```
+
+For the first read call, there is no valid buffer yet. `ensureCurrentBuffer()` 
is called, and for the first read(), prefetch count is set as 1.

Review Comment:
   add backticks around 'read()'



##########
hadoop-tools/hadoop-aws/src/site/markdown/tools/hadoop-aws/prefetching.md:
##########
@@ -0,0 +1,151 @@
+<!---
+  Licensed under the Apache License, Version 2.0 (the "License");
+  you may not use this file except in compliance with the License.
+  You may obtain a copy of the License at
+
+   http://www.apache.org/licenses/LICENSE-2.0
+
+  Unless required by applicable law or agreed to in writing, software
+  distributed under the License is distributed on an "AS IS" BASIS,
+  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+  See the License for the specific language governing permissions and
+  limitations under the License. See accompanying LICENSE file.
+-->
+
+# S3A Prefetching
+
+
+This document explains the `S3PrefetchingInputStream` and the various 
components it uses.
+
+This input stream implements prefetching and caching to improve read 
performance of the input stream. A high level overview of this feature can also 
be found on 
[this](https://medium.com/pinterest-engineering/improving-efficiency-and-reducing-runtime-using-s3-read-optimization-b31da4b60fa0)
 blogpost.
+
+With prefetching, we divide the file into blocks of a fixed size (default is 
8MB), associate buffers to these blocks, and then read data into these buffers 
asynchronously. We also potentially cache these blocks.
+
+### Basic Concepts
+
+* **File** : A binary blob of data stored on some storage device.

Review Comment:
   should the doc use Object to refer to something in s3, to distinguish it 
from the local FS files?



##########
hadoop-tools/hadoop-aws/src/site/markdown/tools/hadoop-aws/prefetching.md:
##########
@@ -0,0 +1,151 @@
+<!---
+  Licensed under the Apache License, Version 2.0 (the "License");
+  you may not use this file except in compliance with the License.
+  You may obtain a copy of the License at
+
+   http://www.apache.org/licenses/LICENSE-2.0
+
+  Unless required by applicable law or agreed to in writing, software
+  distributed under the License is distributed on an "AS IS" BASIS,
+  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+  See the License for the specific language governing permissions and
+  limitations under the License. See accompanying LICENSE file.
+-->
+
+# S3A Prefetching
+
+
+This document explains the `S3PrefetchingInputStream` and the various 
components it uses.
+
+This input stream implements prefetching and caching to improve read 
performance of the input stream. A high level overview of this feature can also 
be found on 
[this](https://medium.com/pinterest-engineering/improving-efficiency-and-reducing-runtime-using-s3-read-optimization-b31da4b60fa0)
 blogpost.
+
+With prefetching, we divide the file into blocks of a fixed size (default is 
8MB), associate buffers to these blocks, and then read data into these buffers 
asynchronously. We also potentially cache these blocks.
+
+### Basic Concepts
+
+* **File** : A binary blob of data stored on some storage device.
+* **Block :** A file is divided into a number of blocks. The default size of a 
block is 8MB, but can be configured. The size of the first n-1 blocks is same,  
and the size of the last block may be same or smaller.
+* **Block based reading** : The granularity of read is one block. That is, we 
read an entire block and return or none at all. Multiple blocks may be read in 
parallel.
+
+### Configuring the stream
+
+|Property    |Meaning    |Default    |
+|---   |---    |---    |
+|fs.s3a.prefetch.enabled    |Enable the prefetch input stream    |TRUE |
+|fs.s3a.prefetch.block.size    |Size of a block    |8MB    |

Review Comment:
   8MB isn't gong to be a valid value. `8M` should be



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org

[GitHub] [hadoop] steveloughran commented on a diff in pull request #4205: HADOOP-18177. Document prefetching architecture.

Reply via email to