ivandika3 commented on code in PR #10006:
URL: https://github.com/apache/ozone/pull/10006#discussion_r3196028251


##########
hadoop-hdds/docs/content/design/s3-multi-chunks-verification.md:
##########
@@ -0,0 +1,142 @@
+---
+title: S3 Multi Chunks Verification
+summary: Add signature verification support for AWS Signature V4 streaming 
chunked uploads in S3G.
+date: 2026-04-29
+jira: HDDS-12542
+status: proposed
+author: Chung-En Lee
+---
+<!--
+  Licensed under the Apache License, Version 2.0 (the "License");
+  you may not use this file except in compliance with the License.
+  You may obtain a copy of the License at
+
+   http://www.apache.org/licenses/LICENSE-2.0
+
+  Unless required by applicable law or agreed to in writing, software
+  distributed under the License is distributed on an "AS IS" BASIS,
+  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+  See the License for the specific language governing permissions and
+  limitations under the License. See accompanying LICENSE file.
+-->
+
+# Context & Motivation
+
+Ozone S3 Gateway (S3G) currently utilizes SignedChunksInputStream to handle 
aws-chunked content-encoding for AWS Signature V4. However, it doesn’t do any 
signature verification now. This proposal aims to complete the existing 
SignedChunksInputStream to make sure signature verification is correct and 
minimize performance overhead.
+
+# Goal
+
+Support signature verification for AWS Signature Version 4 streaming chunked 
uploads with the following algorithms:
+- STREAMING-AWS4-HMAC-SHA256-PAYLOAD
+- STREAMING-AWS4-HMAC-SHA256-PAYLOAD-TRAILER
+- STREAMING-AWS4-ECDSA-P256-SHA256-PAYLOAD
+- STREAMING-AWS4-ECDSA-P256-SHA256-PAYLOAD-TRAILER
+
+# Proposed Solution
+
+Currently, the SignedChunksInputStream successfully parses the S3 chunked 
upload payload but lacks the actual signature verification. This proposal 
enhances the existing stream to perform real-time signature verification, while 
ensuring the output remains fully compatible with Ozone's native, 
high-throughput write APIs.
+
+## Secret Key
+
+Currently, the AWS Secret Keys are securely stored and managed exclusively 
within the Ozone Manager (OM). To enable the S3 Gateway (S3G) to independently 
verify chunked payloads, it requires access to verification materials. We 
propose adding a new internal OM API specifically for S3G to retrieve this data.

Review Comment:
   > Just to clarify, the verification process require the secret key. The 
official AWS documentation explains how the signing key is generated. See 
https://docs.aws.amazon.com/AmazonS3/latest/API/sigv4-streaming.html.
   At step 2.
   
   Yes, you're right, we do need the secret key. Thanks for the clarification.
   
   > One-way Hashing: The key S3G receives is a derived key computed through 
multiple iterations of HMAC-SHA256. Since HMAC is a one-way cryptographic 
function, even if this derived key is compromised, it is impossible to get the 
original secret key
   
   Yes, I think we can use the signing key to mask the secret key (since it is 
derived through one-way hashing). The main security requirement is that the 
response returned by the OM must not include the plaintext secret key.
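   
   For reference, a minimal sketch of the signing key derivation described in 
the AWS SigV4 documentation: a chain of four HMAC-SHA256 operations over the 
date (yyyyMMdd), region, service, and the literal "aws4_request". The class and 
method names here are hypothetical and are not part of Ozone.

```java
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import javax.crypto.Mac;
import javax.crypto.spec.SecretKeySpec;

// Hypothetical helper for illustration only; not an existing Ozone class.
final class SigningKeySketch {

  private SigningKeySketch() {
  }

  // One HMAC-SHA256 step: key is raw bytes, data is encoded as UTF-8.
  static byte[] hmacSha256(byte[] key, String data) {
    try {
      Mac mac = Mac.getInstance("HmacSHA256");
      mac.init(new SecretKeySpec(key, "HmacSHA256"));
      return mac.doFinal(data.getBytes(StandardCharsets.UTF_8));
    } catch (GeneralSecurityException e) {
      throw new IllegalStateException("HmacSHA256 unavailable", e);
    }
  }

  // SigV4 signing key: the secret key is only used as the first HMAC key,
  // so the result does not reveal it. The key is scoped to one date,
  // region, and service.
  static byte[] deriveSigningKey(String secretKey, String date,
      String region, String service) {
    byte[] kSecret = ("AWS4" + secretKey).getBytes(StandardCharsets.UTF_8);
    byte[] kDate = hmacSha256(kSecret, date);
    byte[] kRegion = hmacSha256(kDate, region);
    byte[] kService = hmacSha256(kRegion, service);
    return hmacSha256(kService, "aws4_request");
  }
}
```

   Because the date is part of the derivation, a signing key returned by the OM 
would only be valid for requests whose credential scope uses that same date, 
region, and service.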
   
   > For a multi-chunk streaming upload, this only adds one additional RPC per 
request. I believe the impact on overall performance will be small. Or maybe we 
could piggyback the derived key onto the metadata returned by the createKey 
call. This would eliminate the extra RPC entirely by providing the key upfront 
for all subsequent chunk uploads.
   
   Hm, to stay consistent (i.e., to handle secret revocation), S3G needs to send 
an additional RPC per PutObject. However, I think this might have performance 
implications, since each PutObject would generate around 5+ RPC requests (S3 
secret fetch, get volume, get bucket, open key, commit key). Additionally, 
S3GetSecretRequest is implemented as a Ratis transaction, so its latency should 
be higher than that of a normal read request (without Ratis). One idea is to 
cache the signing key and refresh it periodically, but this has consistency and 
security implications, since the secret key might already be revoked while we 
still allow requests signed with it.
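   
   To make the caching idea concrete, here is a rough sketch of a TTL cache for 
signing keys. Everything in it is an assumption for illustration: the class 
name, the string cache key (for example access key ID plus date, region, and 
service), and the loader function standing in for the OM RPC. It also shows the 
trade-off: a revoked secret stays usable until its entry expires.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;
import java.util.function.LongSupplier;

// Hypothetical sketch; not an existing Ozone class.
final class SigningKeyCacheSketch {

  private static final class Entry {
    private final byte[] key;
    private final long expiresAtMillis;

    Entry(byte[] key, long expiresAtMillis) {
      this.key = key;
      this.expiresAtMillis = expiresAtMillis;
    }
  }

  private final ConcurrentHashMap<String, Entry> entries =
      new ConcurrentHashMap<>();
  private final long ttlMillis;
  private final LongSupplier clock;
  // Stands in for the (assumed) OM call that returns a signing key.
  private final Function<String, byte[]> loader;

  SigningKeyCacheSketch(long ttlMillis, LongSupplier clock,
      Function<String, byte[]> loader) {
    this.ttlMillis = ttlMillis;
    this.clock = clock;
    this.loader = loader;
  }

  // Returns the cached key while it is fresh; otherwise reloads it.
  // The loader runs inside compute(), so concurrent callers for the same
  // scope wait for a single load instead of each issuing an RPC.
  byte[] get(String scope) {
    long now = clock.getAsLong();
    Entry entry = entries.compute(scope, (k, old) ->
        (old != null && now < old.expiresAtMillis)
            ? old
            : new Entry(loader.apply(k), now + ttlMillis));
    return entry.key;
  }
}
```

   A shorter TTL narrows the window in which a revoked secret is still 
accepted, at the cost of more RPCs to the OM.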
   
   The piggyback approach might work to amortize the cost of fetching the key. 
We can think about this further.
   
   Our cluster uses a different strategy, so I'm not that well-versed in this. 
We might need input from people who are more familiar with it.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

