This is an automated email from the ASF dual-hosted git repository. szetszwo pushed a commit to branch HDDS-4454 in repository https://gitbox.apache.org/repos/asf/ozone.git
commit 13ffca792ce59394f6076ddcb8c268298e0b0c3a
Author: Tsz-Wo Nicholas Sze <[email protected]>
AuthorDate: Tue Nov 8 17:16:45 2022 -0800

    HDDS-7425. Add documentation for the new Streaming Pipeline feature. (#3913)

    (cherry picked from commit 7fc85ee1fe3702ed47be7d645d1e388e917670d8)
---
 .../content/feature/Streaming-Write-Pipeline.md    | 136 +++++++++++++++++++++
 1 file changed, 136 insertions(+)

diff --git a/hadoop-hdds/docs/content/feature/Streaming-Write-Pipeline.md b/hadoop-hdds/docs/content/feature/Streaming-Write-Pipeline.md
new file mode 100644
index 0000000000..5f55afebc3
--- /dev/null
+++ b/hadoop-hdds/docs/content/feature/Streaming-Write-Pipeline.md
@@ -0,0 +1,136 @@
+---
+title: "Streaming Write Pipeline"
+weight: 1
+menu:
+   main:
+      parent: Features
+summary: A new write pipeline using Ratis Streaming.
+---
+<!---
+  Licensed to the Apache Software Foundation (ASF) under one or more
+  contributor license agreements.  See the NOTICE file distributed with
+  this work for additional information regarding copyright ownership.
+  The ASF licenses this file to You under the Apache License, Version 2.0
+  (the "License"); you may not use this file except in compliance with
+  the License.  You may obtain a copy of the License at
+
+      http://www.apache.org/licenses/LICENSE-2.0
+
+  Unless required by applicable law or agreed to in writing, software
+  distributed under the License is distributed on an "AS IS" BASIS,
+  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+  See the License for the specific language governing permissions and
+  limitations under the License.
+-->
+
+This document discusses the new Streaming Write Pipeline feature in Ozone.
+It is implemented with the Ratis Streaming API.
+Note that the existing Ozone Write Pipeline is implemented with the Ratis Async API.
+We refer to the new Streaming Write Pipeline as Write Pipeline V2
+and the existing Async Write Pipeline as Write Pipeline V1.
+
+The Streaming Write Pipeline V2 improves performance
+by providing better network topology awareness
+and removing the performance bottlenecks in V1.
+The V2 implementation also avoids unnecessary buffer copying
+(using Netty zero copy)
+and makes better use of the CPUs and the disks in each datanode.
+
+## Configuration Properties
+
+Set the following properties in the Ozone configuration file `ozone-site.xml`.
+
+- To enable the Streaming Write Pipeline feature, set the following property to `true` (shown here as `false`).
+```XML
+  <property>
+    <name>dfs.container.ratis.datastream.enabled</name>
+    <value>false</value>
+    <tag>OZONE, CONTAINER, RATIS, DATASTREAM</tag>
+    <description>It specifies whether to enable data stream of container.</description>
+  </property>
+```
+- Datanodes listen on the following port for the streaming traffic.
+```XML
+  <property>
+    <name>dfs.container.ratis.datastream.port</name>
+    <value>9855</value>
+    <tag>OZONE, CONTAINER, RATIS, DATASTREAM</tag>
+    <description>The datastream port number of container.</description>
+  </property>
+```
+- To use Streaming in the FileSystem API, set the following property to `true` (shown here as `false`).
+```XML
+  <property>
+    <name>ozone.fs.datastream.enabled</name>
+    <value>false</value>
+    <tag>OZONE, DATANODE</tag>
+    <description>
+      To enable/disable filesystem write via ratis streaming.
+    </description>
+  </property>
+```
+
+## Client APIs
+
+### OzoneDataStreamOutput
+
+The new `OzoneDataStreamOutput` class is very similar to the existing `OzoneOutputStream` class,
+except that `OzoneDataStreamOutput` uses `ByteBuffer` as a parameter in the `write` methods
+while `OzoneOutputStream` uses `byte[]`.
+The reason for using a `ByteBuffer`, instead of a `byte[]`,
+is to support zero buffer copying.
+A typical `write` method is shown below:
+
+- OzoneDataStreamOutput
+```java
+  public void write(ByteBuffer b, int off, int len) throws IOException;
+```
+
+- OzoneOutputStream
+```java
+  public void write(byte[] b, int off, int len) throws IOException;
+```
+### OzoneBucket
+
+The following new methods are added to `OzoneBucket`
+for creating keys using the Streaming Write Pipeline.
+
+- createStreamKey
+```java
+  public OzoneDataStreamOutput createStreamKey(String key, long size)
+      throws IOException;
+```
+
+```java
+  public OzoneDataStreamOutput createStreamKey(String key, long size,
+      ReplicationConfig replicationConfig, Map<String, String> keyMetadata)
+      throws IOException;
+```
+- createMultipartStreamKey
+```java
+  public OzoneDataStreamOutput createMultipartStreamKey(String key, long size,
+      int partNumber, String uploadID) throws IOException;
+```
+
+Note that the methods above have the same parameter list
+as the existing `createKey` and `createMultipartKey` methods.
+
+Below is an example of creating a key from a local file using a memory-mapped buffer.
+```java
+  // Create a memory-mapped buffer from a local file:
+  final FileChannel channel = ... // local file channel
+  final long length = ... // length of the data
+  final ByteBuffer mapped = channel.map(FileChannel.MapMode.READ_ONLY, 0, length);
+
+  // Create an OzoneDataStreamOutput
+  final OzoneBucket bucket = ... // an Ozone bucket
+  final String key = ... // the key name
+  final OzoneDataStreamOutput out = bucket.createStreamKey(key, length);
+
+  // Write the memory-mapped buffer to the key output
+  out.write(mapped);
+
+  // close
+  out.close(); // In practice, use try-with-resources to close it.
+  channel.close(); // In practice, use try-with-resources to close it.
+```
\ No newline at end of file
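
The memory-mapped example in the patch leaves resource handling as a comment on the two `close()` calls. Below is a minimal sketch of the try-with-resources form, using only the JDK so that it runs without an Ozone cluster. `ByteBufferSink` and `CountingSink` are hypothetical stand-ins for `OzoneDataStreamOutput` and are not part of Ozone; with a real bucket, the sink would presumably be the object returned by `bucket.createStreamKey(key, length)`.

```java
import java.io.Closeable;
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class StreamKeySketch {
  /** Hypothetical stand-in for OzoneDataStreamOutput: only write(ByteBuffer) and close(). */
  interface ByteBufferSink extends Closeable {
    void write(ByteBuffer b) throws IOException;
  }

  /** Hypothetical in-memory sink that only counts bytes; it replaces the real key output here. */
  static final class CountingSink implements ByteBufferSink {
    long bytesWritten;
    boolean closed;

    @Override
    public void write(ByteBuffer b) {
      bytesWritten += b.remaining();
      b.position(b.limit()); // mark the buffer as fully consumed
    }

    @Override
    public void close() {
      closed = true;
    }
  }

  /** Maps the whole file read-only and writes it to the sink; returns the mapped length. */
  static long copy(Path file, ByteBufferSink sink) throws IOException {
    // Resources are closed in reverse order: first the sink, then the channel.
    try (FileChannel channel = FileChannel.open(file, StandardOpenOption.READ);
         ByteBufferSink out = sink) {
      final long length = channel.size();
      final ByteBuffer mapped = channel.map(FileChannel.MapMode.READ_ONLY, 0, length);
      out.write(mapped);
      return length;
    }
  }

  public static void main(String[] args) throws IOException {
    final Path tmp = Files.createTempFile("stream-key", ".bin");
    tmp.toFile().deleteOnExit();
    Files.write(tmp, new byte[4096]);

    final CountingSink sink = new CountingSink();
    final long length = copy(tmp, sink);
    System.out.println(length + " bytes written, closed=" + sink.closed);
  }
}
```

Because both resources are declared in one `try` header, the output is closed before the file channel even if `write` throws. A single `FileChannel.map` call cannot exceed `Integer.MAX_VALUE` bytes, so a larger file would need to be mapped and written in several regions.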
