[jira] [Work logged] (HADOOP-18177) document use and architecture design of prefetching s3a input stream

ASF GitHub Bot (Jira) Tue, 26 Apr 2022 02:05:07 -0700


     [ 
https://issues.apache.org/jira/browse/HADOOP-18177?focusedWorklogId=762184&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-762184
 ]


ASF GitHub Bot logged work on HADOOP-18177:
-------------------------------------------

                Author: ASF GitHub Bot
            Created on: 26/Apr/22 09:04
            Start Date: 26/Apr/22 09:04
    Worklog Time Spent: 10m 
      Work Description: dannycjones commented on code in PR #4205:
URL: https://github.com/apache/hadoop/pull/4205#discussion_r858455675


##########
hadoop-tools/hadoop-aws/src/site/markdown/tools/hadoop-aws/prefetching.md:
##########
@@ -14,94 +14,114 @@
 
 # S3A Prefetching
 
-
 This document explains the `S3PrefetchingInputStream` and the various 
components it uses.
 
-This input stream implements prefetching and caching to improve read 
performance of the input stream. A high level overview of this feature can also 
be found on 
[this](https://medium.com/pinterest-engineering/improving-efficiency-and-reducing-runtime-using-s3-read-optimization-b31da4b60fa0)
 blogpost.
+This input stream implements prefetching and caching to improve read 
performance of the input
+stream. A high level overview of this feature was published in
+[Pinterest Engineering's blog post titled "Improving efficiency and reducing 
runtime using S3 read 
optimization"](https://medium.com/pinterest-engineering/improving-efficiency-and-reducing-runtime-using-s3-read-optimization-b31da4b60fa0)
+blogpost.
 
-With prefetching, we divide the file into blocks of a fixed size (default is 
8MB), associate buffers to these blocks, and then read data into these buffers 
asynchronously. We also potentially cache these blocks.
+With prefetching, the input stream divides the remote file into blocks of a 
fixed size, associates
+buffers to these blocks and then reads data into these buffers asynchronously. 
It also potentially
+caches these blocks.
 
 ### Basic Concepts
 
-* **File** : A binary blob of data stored on some storage device.
-* **Block :** A file is divided into a number of blocks. The default size of a 
block is 8MB, but can be configured. The size of the first n-1 blocks is same,  
and the size of the last block may be same or smaller.
-* **Block based reading** : The granularity of read is one block. That is, we 
read an entire block and return or none at all. Multiple blocks may be read in 
parallel.
+* **Remote File**: A binary blob of data stored on some storage device.
+* **Block**: A file is divided into a number of blocks. The size of the first 
n-1 blocks is same,
+  and the size of the last block may be same or smaller.
+* **Block based reading**: The granularity of read is one block. That is, 
either an entire block is
+  read and returned or none at all. Multiple blocks may be read in parallel.
 
 ### Configuring the stream
 
 |Property    |Meaning    |Default    |
 |---   |---    |---    |
-|fs.s3a.prefetch.enabled    |Enable the prefetch input stream    |TRUE |
-|fs.s3a.prefetch.block.size    |Size of a block    |8MB    |
-|fs.s3a.prefetch.block.count    |Number of blocks to prefetch    |8    |
+|fs.s3a.prefetch.enabled    |Enable the prefetch input stream    |`true` |
+|fs.s3a.prefetch.block.size    |Size of a block    |`8M`    |
+|fs.s3a.prefetch.block.count    |Number of blocks to prefetch    |`8`    |

Review Comment:
   We want backticks around the configuration keys too.
   
   ```suggestion
   |`fs.s3a.prefetch.enabled`    |Enable the prefetch input stream    |`true` |
   |`fs.s3a.prefetch.block.size`    |Size of a block    |`8M`    |
   |`fs.s3a.prefetch.block.count`    |Number of blocks to prefetch    |`8`    |
   ```



##########
hadoop-tools/hadoop-aws/src/site/markdown/tools/hadoop-aws/prefetching.md:
##########
@@ -14,94 +14,114 @@
 
 # S3A Prefetching
 
-
 This document explains the `S3PrefetchingInputStream` and the various 
components it uses.
 
-This input stream implements prefetching and caching to improve read 
performance of the input stream. A high level overview of this feature can also 
be found on 
[this](https://medium.com/pinterest-engineering/improving-efficiency-and-reducing-runtime-using-s3-read-optimization-b31da4b60fa0)
 blogpost.
+This input stream implements prefetching and caching to improve read 
performance of the input
+stream. A high level overview of this feature was published in
+[Pinterest Engineering's blog post titled "Improving efficiency and reducing 
runtime using S3 read 
optimization"](https://medium.com/pinterest-engineering/improving-efficiency-and-reducing-runtime-using-s3-read-optimization-b31da4b60fa0)
+blogpost.

Review Comment:
   drop `blogpost` as it is already mentioned earlier



##########
hadoop-tools/hadoop-aws/src/site/markdown/tools/hadoop-aws/prefetching.md:
##########
@@ -14,94 +14,114 @@
 
 # S3A Prefetching
 
-
 This document explains the `S3PrefetchingInputStream` and the various 
components it uses.
 
-This input stream implements prefetching and caching to improve read 
performance of the input stream. A high level overview of this feature can also 
be found on 
[this](https://medium.com/pinterest-engineering/improving-efficiency-and-reducing-runtime-using-s3-read-optimization-b31da4b60fa0)
 blogpost.
+This input stream implements prefetching and caching to improve read 
performance of the input
+stream. A high level overview of this feature was published in
+[Pinterest Engineering's blog post titled "Improving efficiency and reducing 
runtime using S3 read 
optimization"](https://medium.com/pinterest-engineering/improving-efficiency-and-reducing-runtime-using-s3-read-optimization-b31da4b60fa0)
+blogpost.
 
-With prefetching, we divide the file into blocks of a fixed size (default is 
8MB), associate buffers to these blocks, and then read data into these buffers 
asynchronously. We also potentially cache these blocks.
+With prefetching, the input stream divides the remote file into blocks of a 
fixed size, associates
+buffers to these blocks and then reads data into these buffers asynchronously. 
It also potentially
+caches these blocks.
 
 ### Basic Concepts
 
-* **File** : A binary blob of data stored on some storage device.
-* **Block :** A file is divided into a number of blocks. The default size of a 
block is 8MB, but can be configured. The size of the first n-1 blocks is same,  
and the size of the last block may be same or smaller.
-* **Block based reading** : The granularity of read is one block. That is, we 
read an entire block and return or none at all. Multiple blocks may be read in 
parallel.
+* **Remote File**: A binary blob of data stored on some storage device.
+* **Block**: A file is divided into a number of blocks. The size of the first 
n-1 blocks is same,
+  and the size of the last block may be same or smaller.
+* **Block based reading**: The granularity of read is one block. That is, 
either an entire block is
+  read and returned or none at all. Multiple blocks may be read in parallel.
 
 ### Configuring the stream
 
 |Property    |Meaning    |Default    |
 |---   |---    |---    |
-|fs.s3a.prefetch.enabled    |Enable the prefetch input stream    |TRUE |
-|fs.s3a.prefetch.block.size    |Size of a block    |8MB    |
-|fs.s3a.prefetch.block.count    |Number of blocks to prefetch    |8    |
+|fs.s3a.prefetch.enabled    |Enable the prefetch input stream    |`true` |
+|fs.s3a.prefetch.block.size    |Size of a block    |`8M`    |
+|fs.s3a.prefetch.block.count    |Number of blocks to prefetch    |`8`    |
 
-### Key Components:
+### Key Components
 
-`S3PrefetchingInputStream` - When prefetching is enabled, S3AFileSystem will 
return an instance of this class as the input stream. Depending on the file 
size, it will either use the `S3InMemoryInputStream` or the 
`S3CachingInputStream` as the underlying input stream.
+`S3PrefetchingInputStream` - When prefetching is enabled, S3AFileSystem will 
return an instance of
+this class as the input stream. Depending on the remote file size, it will 
either use
+the `S3InMemoryInputStream` or the `S3CachingInputStream` as the underlying 
input stream.
 
-`S3InMemoryInputStream` - Underlying input stream used when the file size < 
configured block size. Will read the entire file into memory.
+`S3InMemoryInputStream` - Underlying input stream used when the remote file 
size < configured block
+size. Will read the entire remote file into memory.
 
-`S3CachingInputStream` - Underlying input stream used when file size > 
configured block size. Uses asynchronous prefetching of blocks and caching to 
improve performance.
+`S3CachingInputStream` - Underlying input stream used when remote file size > 
configured block size.
+Uses asynchronous prefetching of blocks and caching to improve performance.
 
-`BlockData` - Holds information about the blocks in a file, such as:
+`BlockData` - Holds information about the blocks in a remote file, such as:
 
-* Number of blocks in the file
+* Number of blocks in the remote file
 * Block size
-* State of each block (initially all blocks have state *NOT_READY*). Other 
states are: Queued, Ready, Cached.
+* State of each block (initially all blocks have state *NOT_READY*). Other 
states are: Queued,
+  Ready, Cached.
 
 `BufferData` - Holds the buffer and additional information about it such as:
 
 * The block number this buffer is for
-* State of the buffer (Unknown, Blank, Prefetching, Caching, Ready, Done). 
Initial state of a buffer is blank.
+* State of the buffer (Unknown, Blank, Prefetching, Caching, Ready, Done). 
Initial state of a buffer
+  is blank.

Review Comment:
   As an example for line breaks, we might instead have something like...
   
   ```suggestion
   * State of the buffer (Unknown, Blank, Prefetching, Caching, Ready, Done).
     Initial state of a buffer is blank.
   ```
   
   This way, we can comment or edit individual sentences - and also avoid 
breaking this sentence in half.



##########
hadoop-tools/hadoop-aws/src/site/markdown/tools/hadoop-aws/prefetching.md:
##########
@@ -110,29 +130,42 @@ in.read(buffer, 0, 5MB)
 in.read(buffer, 0, 8MB)
 ```
 
-For the first read call, there is no valid buffer yet. `ensureCurrentBuffer()` 
is called, and for the first read(), prefetch count is set as 1.
-
-The current block (block 0) is read synchronously, while the blocks to be 
prefetched (block 1) is read asynchronously.
+For the first read call, there is no valid buffer yet. `ensureCurrentBuffer()` 
is called, and for
+the first `read()`, prefetch count is set as 1.
 
-The `CachingBlockManager` is responsible for getting buffers from the buffer 
pool and reading data into them. This process of acquiring the buffer pool 
works as follows:
+The current block (block 0) is read synchronously, while the blocks to be 
prefetched (block 1) is
+read asynchronously.
 
+The `CachingBlockManager` is responsible for getting buffers from the buffer 
pool and reading data
+into them. This process of acquiring the buffer pool works as follows:
 
-* The buffer pool keeps a map of allocated buffers and a pool of available 
buffers. The size of this pool is = prefetch block count + 1. For the default 
value of 8, we will have a buffer pool of size 9.
+* The buffer pool keeps a map of allocated buffers and a pool of available 
buffers. The size of this
+  pool is = prefetch block count + 1. If the prefetch block count is 8, the 
buffer pool has a size
+  of 9.
 * If the pool is not yet at capacity, create a new buffer and add it to the 
pool.
-* If it’s at capacity, check if any buffers with state = done can be released. 
Releasing a buffer means removing it from allocated and returning it back to 
the pool of available buffers.
-* If we have no buffers with state = done currently then nothing will be 
released, so retry the above step at a fixed interval a few times till a buffer 
becomes available.
-* If after multiple retries we still don’t have an available buffer, release a 
buffer in the ready state. The buffer for the block furthest from the current 
block is released.
-
-Once a buffer has been acquired by `CachingBlockManager`, if the buffer is in 
a *READY* state, we can just return it. This means that data was already read 
into this buffer asynchronously by a prefetch. If it’s state is *BLANK,* then 
data is read into it using `S3Reader.read(ByteBuffer buffer, long offset, int 
size).`
-
-For the second read call, `in.read(buffer, 0, 8MB)`, since the block sizes are 
of 8MB and we have only read 5MB of block 0 so far, 3MB of the required data 
will be read from the current block 0. Once we’re done with this block, we’ll 
request the next block (block 1), which will already have been prefetched and 
so we can just start reading from it. Also, while reading from block 1 we will 
also issue prefetch requests for the next blocks. The number of blocks to be 
prefetched is determined by `fs.s3a.prefetch.block.count`, with the default 
being 8.
-
-
-### Random Reads
+* If it’s at capacity, check if any buffers with state = done can be released. 
Releasing a buffer
+  means removing it from allocated and returning it back to the pool of 
available buffers.
+* If there are no buffers with state = done currently then nothing will be 
released, so retry the
+  above step at a fixed interval a few times till a buffer becomes available.
+* If after multiple retries there are still no available buffers, release a 
buffer in the ready
+  state. The buffer for the block furthest from the current block is released.
+
+Once a buffer has been acquired by `CachingBlockManager`, if the buffer is in 
a *READY* state, it is
+returned. This means that data was already read into this buffer 
asynchronously by a prefetch. If
+it’s state is *BLANK,* then data is read into it
+using `S3Reader.read(ByteBuffer buffer, long offset, int size).`
+
+For the second read call, `in.read(buffer, 0, 8MB)`, since the block sizes are 
of 8MB and only `5MB`
+of block 0 has been read so far, 3MB of the required data will be read from 
the current block 0.

Review Comment:
   Why only 5MB has backticks? Let's drop or add to all.



##########
hadoop-tools/hadoop-aws/src/site/markdown/tools/hadoop-aws/prefetching.md:
##########
@@ -14,94 +14,114 @@
 
 # S3A Prefetching
 
-
 This document explains the `S3PrefetchingInputStream` and the various 
components it uses.
 
-This input stream implements prefetching and caching to improve read 
performance of the input stream. A high level overview of this feature can also 
be found on 
[this](https://medium.com/pinterest-engineering/improving-efficiency-and-reducing-runtime-using-s3-read-optimization-b31da4b60fa0)
 blogpost.
+This input stream implements prefetching and caching to improve read 
performance of the input
+stream. A high level overview of this feature was published in

Review Comment:
   For the line breaks, I believe Steve is suggesting each sentence start on a 
new line. If there's no empty line, Markdown will keep it part of the same 
paragraph.
   
   So in this instance, we should move "A high level.." onto the next line. We 
probably want to do this throughout the file.





Issue Time Tracking
-------------------

    Worklog Id:     (was: 762184)
    Time Spent: 1h 40m  (was: 1.5h)

> document use and architecture design of prefetching s3a input stream
> --------------------------------------------------------------------
>
>                 Key: HADOOP-18177
>                 URL: https://issues.apache.org/jira/browse/HADOOP-18177
>             Project: Hadoop Common
>          Issue Type: Sub-task
>          Components: documentation, fs/s3
>    Affects Versions: 3.4.0
>            Reporter: Steve Loughran
>            Assignee: Ahmar Suhail
>            Priority: Major
>              Labels: pull-request-available
>          Time Spent: 1h 40m
>  Remaining Estimate: 0h
>
> Document S3PrefetchingInputStream for users  (including any new failure modes 
> in troubleshooting) and the architecture for maintainers
> there's some markdown in 
> hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/read/README.md 
> already



--
This message was sent by Atlassian Jira
(v8.20.7#820007)

---------------------------------------------------------------------
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org

[jira] [Work logged] (HADOOP-18177) document use and architecture design of prefetching s3a input stream

Reply via email to