szehon-ho commented on a change in pull request #2963: URL: https://github.com/apache/iceberg/pull/2963#discussion_r688163674
##########
File path: site/docs/aws.md
##########
@@ -340,10 +340,9 @@ For more details, please read [S3 ACL Documentation](https://docs.aws.amazon.com
 ### Object Store File Layout
 
 S3 and many other cloud storage services [throttle requests based on object prefix](https://aws.amazon.com/premiumsupport/knowledge-center/s3-request-limit-avoid-throttling/).
-This means data stored in a traditional Hive storage layout has bad read and write throughput since data files of the same partition are placed under the same prefix.
-Iceberg by default uses the Hive storage layout, but can be switched to use a different `ObjectStoreLocationProvider`.
-In this mode, a hash string is added to the beginning of each file path, so that files are equally distributed across all prefixes in an S3 bucket.
-This results in minimized throttling and maximized throughput for S3-related IO operations.
+Data stored in a traditional Hive storage layout can face bad read and write throughput as single table is stored under the same filepath prefix.

Review comment:
Nit: "a single table"

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
