davidradl commented on code in PR #27937:
URL: https://github.com/apache/flink/pull/27937#discussion_r3092331154


##########
docs/content/docs/deployment/filesystems/s3.md:
##########
@@ -64,94 +64,288 @@ env.configure(config);
 
 Note that these examples are *not* exhaustive and you can use S3 in other 
places as well, including your [high availability setup]({{< ref 
"docs/deployment/ha/overview" >}}) or the [EmbeddedRocksDBStateBackend]({{< ref 
"docs/ops/state/state_backends" >}}#the-rocksdbstatebackend); everywhere that 
Flink expects a FileSystem URI (unless otherwise stated).
 
-For most use cases, you may use one of our `flink-s3-fs-hadoop` and 
`flink-s3-fs-presto` S3 filesystem plugins which are self-contained and easy to 
set up.
-For some cases, however, e.g., for using S3 as YARN's resource storage dir, it 
may be necessary to set up a specific Hadoop S3 filesystem implementation.
+## S3 FileSystem Implementations
 
-### Hadoop/Presto S3 File Systems plugins
+Flink provides three independent S3 filesystem implementations, each with 
different trade-offs:
+
+- **Native S3 FileSystem** (`flink-s3-fs-native`): Built directly on AWS SDK 
v2 with async I/O and parallel transfers removing the dependency from hadoop 
entirely. This implementation supports both checkpointing and the FileSystem 
sink. The Native S3 FileSystem aims to provide integrated support for 
checkpointing as well as FileSystem sink, removing the need to use Presto S3 
FileSystem for checkpointing and Hadoop S3 FileSystem for the FileSystem sink. 
[Benchmarks](https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=406620396)
 show ~2x higher checkpoint throughput (~200 MB/s vs ~90 MB/s) compared to the 
Presto implementation at state sizes up to 15 GB. **Experimental** in Flink 2.3.

Review Comment:
   I am not sure what you mean by both checkpointing and the FileSystem sink. I 
assume it is the same as the next sentence, if so we should remove this 
sentence. 
   
   nit: as well as FileSystem sink -> as well as the FileSystem sink.
   
   Some questions that came up for me:
   - Are there any types of sinks that are not supported? If not we could say 
this supports all types of sync including integrated support for checkpointing 
as well as FileSystem support.
   - does this mean that FileSystem is not a checkpointing sink? This is bit 
confusing as the docs talk of checkpointing FileSystemCheckpointStorage. I 
wonder if we could be clear to prevent misreadings. 



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to