[GitHub] [ozone] captainzmc commented on a change in pull request #1515: HDDS-4373. [Design] Ozone support append operation

GitBox Wed, 28 Oct 2020 00:35:23 -0700


captainzmc commented on a change in pull request #1515:
URL: https://github.com/apache/ozone/pull/1515#discussion_r513230594




##########
File path: hadoop-hdds/docs/content/design/append.md
##########
@@ -0,0 +1,87 @@
+---
+title: Append
+summary: Append to the existing key.
+date: 2020-10-22
+jira: HDDS-4333
+status: implementing
+author: captainzmc
+---
+<!--
+  Licensed under the Apache License, Version 2.0 (the "License");
+  you may not use this file except in compliance with the License.
+  You may obtain a copy of the License at
+   http://www.apache.org/licenses/LICENSE-2.0
+  Unless required by applicable law or agreed to in writing, software
+  distributed under the License is distributed on an "AS IS" BASIS,
+  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+  See the License for the specific language governing permissions and
+  limitations under the License. See accompanying LICENSE file.
+-->
+
+## Introduction
+This is a proposal to introduce an append operation for Ozone, which will allow data to be written at the tail of an existing file.
+ 
+## Goals
+ OzoneClient and OzoneFS Client support the append operation. 
+ While a key is being appended to, it must remain readable by other clients.  
+ After the OutputStream of the append operation is closed, other clients can read the newly appended content. This ensures consistency of read operations.
+## Non-goals
+The hflush operation is outside the scope of this design. HDDS-4353 was created to discuss it.
+## Related jira
+https://issues.apache.org/jira/browse/HDDS-4333
+## Implementation
+### Background conditions:
+A closed container currently cannot be reopened. If every append generates a new block, a key may end up with many blocks smaller than 256 MB (the default block size). Too many blocks make the DB larger and also hurt read performance.
+
+### Solution:
+When an append occurs, determine whether the container of the last block is closed. If it is closed, create a new block. If it is open, append the data to the last block. This avoids creating new blocks as much as possible.
+
+### Request process:
+![avatar](doc-image/append.png)
+
+ 1. The client sends an append key request to OM.
+
+ 2. OM checks whether the key is in appendTable. If it is, another client is already appending to the key, so the key cannot be appended to at this point. If it is not, add the key to appendTable.
+
+ 3. Check whether the last block of the key belongs to a closed container. If so, ask SCM to allocate a new block; if not, use the current block directly.

Review comment:
   Thanks @arp7 for the review. Does changing an existing block cause any 
other problems?
   As far as I know, we can't change a block in a closed container because a 
closed container can't be reopened. However, a block in a container that is 
not closed can be modified. 
   The purpose of this is to reduce the number of blocks per key. If a key is 
appended to very frequently and each append generates a new block, the key 
ends up with a large number of small blocks. Too many small blocks make the DB 
larger and also hurt read performance.
   The current [ozone truncate 
design](https://github.com/apache/ozone/pull/1504/files) also needs to modify 
an existing block. 
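
   To make the block-reuse idea concrete, here is a minimal sketch of steps 2 and 3 of the request process. It is only an illustration under assumptions: `AppendSketch`, `Block`, `ContainerState`, `beginAppend` and `finishAppend` are hypothetical names, not Ozone APIs, the in-memory set stands in for appendTable, and a local counter stands in for asking SCM to allocate a block.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical sketch; none of these types exist in Ozone.
final class AppendSketch {

  enum ContainerState { OPEN, CLOSED }

  // A block plus the state of the container that holds it.
  static final class Block {
    final long id;
    final ContainerState containerState;

    Block(long id, ContainerState containerState) {
      this.id = id;
      this.containerState = containerState;
    }
  }

  // Stand-in for appendTable: keys that currently have an append in progress.
  private final Set<String> appendTable = new HashSet<>();

  // Stand-in for SCM block allocation.
  private long nextBlockId = 1;

  // Step 2: refuse a concurrent append on the same key.
  // Step 3: reuse the last block if its container is still open,
  // otherwise allocate a new block and add it to the key's block list.
  synchronized Block beginAppend(String key, List<Block> blocks) {
    if (!appendTable.add(key)) {
      throw new IllegalStateException("key is already being appended: " + key);
    }
    Block last = blocks.isEmpty() ? null : blocks.get(blocks.size() - 1);
    if (last != null && last.containerState == ContainerState.OPEN) {
      return last;
    }
    Block fresh = new Block(nextBlockId++, ContainerState.OPEN);
    blocks.add(fresh);
    return fresh;
  }

  // Called when the append stream is closed; the key can be appended again.
  synchronized void finishAppend(String key) {
    appendTable.remove(key);
  }
}
```

   With this shape, a key whose last block sits in an open container keeps the same number of blocks across repeated appends, and the block list only grows when the last container has been closed.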
   




