Re: [PR] HDDS-10633. Support Content-MD5 header for checking object integrity when uploading object [ozone]

2026-01-22 Thread via GitHub


hevinhsu commented on PR #9612:
URL: https://github.com/apache/ozone/pull/9612#issuecomment-3783084421

   Thanks @ivandika3 for the reviews and merge!


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


-
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]



Re: [PR] HDDS-10633. Support Content-MD5 header for checking object integrity when uploading object [ozone]

2026-01-21 Thread via GitHub


ivandika3 commented on PR #9612:
URL: https://github.com/apache/ozone/pull/9612#issuecomment-3782891633

   Thanks @hevinhsu for the improvement.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


-
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]



Re: [PR] HDDS-10633. Support Content-MD5 header for checking object integrity when uploading object [ozone]

2026-01-21 Thread via GitHub


ivandika3 merged PR #9612:
URL: https://github.com/apache/ozone/pull/9612


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


-
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]



Re: [PR] HDDS-10633. Support Content-MD5 header for checking object integrity when uploading object [ozone]

2026-01-21 Thread via GitHub


hevinhsu commented on code in PR #9612:
URL: https://github.com/apache/ozone/pull/9612#discussion_r2714961881


##
hadoop-ozone/s3gateway/src/main/java/org/apache/hadoop/ozone/s3/endpoint/ObjectEndpoint.java:
##
@@ -472,10 +472,10 @@ public Response get(
   responseBuilder
   .header(ACCEPT_RANGE_HEADER, RANGE_HEADER_SUPPORTED_UNIT);
 
-  String eTag = keyDetails.getMetadata().get(OzoneConsts.ETAG);
-  if (eTag != null) {
-responseBuilder.header(HttpHeaders.ETAG, wrapInQuotes(eTag));
-String partsCount = extractPartsCount(eTag);
+  String md5Hash = keyDetails.getMetadata().get(OzoneConsts.ETAG);
+  if (md5Hash != null) {
+responseBuilder.header(HttpHeaders.ETAG, wrapInQuotes(md5Hash));
+String partsCount = extractPartsCount(md5Hash);

Review Comment:
   Thanks for pointing this out. Fixed.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


-
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]



Re: [PR] HDDS-10633. Support Content-MD5 header for checking object integrity when uploading object [ozone]

2026-01-21 Thread via GitHub


ivandika3 commented on code in PR #9612:
URL: https://github.com/apache/ozone/pull/9612#discussion_r2711575395


##
hadoop-ozone/s3gateway/src/main/java/org/apache/hadoop/ozone/s3/endpoint/ObjectEndpoint.java:
##
@@ -472,10 +472,10 @@ public Response get(
   responseBuilder
   .header(ACCEPT_RANGE_HEADER, RANGE_HEADER_SUPPORTED_UNIT);
 
-  String eTag = keyDetails.getMetadata().get(OzoneConsts.ETAG);
-  if (eTag != null) {
-responseBuilder.header(HttpHeaders.ETAG, wrapInQuotes(eTag));
-String partsCount = extractPartsCount(eTag);
+  String md5Hash = keyDetails.getMetadata().get(OzoneConsts.ETAG);
+  if (md5Hash != null) {
+responseBuilder.header(HttpHeaders.ETAG, wrapInQuotes(md5Hash));
+String partsCount = extractPartsCount(md5Hash);

Review Comment:
   Nit: This should still be `ETag` since we are actually getting the key ETag 
which may not be MD5. We should only use `md5Hash` if we are actually 
calculating MD5.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


-
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]



Re: [PR] HDDS-10633. Support Content-MD5 header for checking object integrity when uploading object [ozone]

2026-01-21 Thread via GitHub


ivandika3 commented on code in PR #9612:
URL: https://github.com/apache/ozone/pull/9612#discussion_r2711575395


##
hadoop-ozone/s3gateway/src/main/java/org/apache/hadoop/ozone/s3/endpoint/ObjectEndpoint.java:
##
@@ -472,10 +472,10 @@ public Response get(
   responseBuilder
   .header(ACCEPT_RANGE_HEADER, RANGE_HEADER_SUPPORTED_UNIT);
 
-  String eTag = keyDetails.getMetadata().get(OzoneConsts.ETAG);
-  if (eTag != null) {
-responseBuilder.header(HttpHeaders.ETAG, wrapInQuotes(eTag));
-String partsCount = extractPartsCount(eTag);
+  String md5Hash = keyDetails.getMetadata().get(OzoneConsts.ETAG);
+  if (md5Hash != null) {
+responseBuilder.header(HttpHeaders.ETAG, wrapInQuotes(md5Hash));
+String partsCount = extractPartsCount(md5Hash);

Review Comment:
   Nit: This should still be `ETag`.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


-
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]



Re: [PR] HDDS-10633. Support Content-MD5 header for checking object integrity when uploading object [ozone]

2026-01-21 Thread via GitHub


hevinhsu commented on PR #9612:
URL: https://github.com/apache/ozone/pull/9612#issuecomment-3776723119

   Thanks @ivandika3 for the reviews.
   
   I’ve fixed the merge-related issue you pointed out and addressed the nits, 
including renaming `eTag` to `md5Hash`.
   
   I agree with the rename. As you pointed out, ETag is not guaranteed to be an 
MD5 hash (especially for multipart uploads), so using `md5Hash` avoids 
conflating ETag semantics with an MD5 digest.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


-
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]



Re: [PR] HDDS-10633. Support Content-MD5 header for checking object integrity when uploading object [ozone]

2026-01-16 Thread via GitHub


ivandika3 commented on PR #9612:
URL: https://github.com/apache/ozone/pull/9612#issuecomment-3762666502

   > The ETag may or may not be an MD5 digest of the object data.
   
   Yes, ETag can technically be any hash that based on the object data so it 
doesn't need to be MD5. Moreover, object uploaded using multipart upload ETag 
is a hash of all the ETag of all the parts with the "-" prefix, see 
`S3MultipartUploadCompleteRequest#multipartUploadedKeyHash`. 
   
   If we decide in the future to change ETag to not be md5 hash, then we need 
to setup a new MD5 `MessageDigest`. However, since ETag is currently already 
using MD5, we can piggy back that.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


-
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]



Re: [PR] HDDS-10633. Support Content-MD5 header for checking object integrity when uploading object [ozone]

2026-01-16 Thread via GitHub


ivandika3 commented on code in PR #9612:
URL: https://github.com/apache/ozone/pull/9612#discussion_r2700634883


##
hadoop-ozone/integration-test-s3/src/test/java/org/apache/hadoop/ozone/s3/awssdk/v2/AbstractS3SDKV2Tests.java:
##
@@ -235,7 +236,6 @@ public void testPutObject() {
 assertEquals("\"37b51d194a7513e45b56f6524f2d51f2\"", 
getObjectResponse.eTag());
   }
 
-  @Test

Review Comment:
   Wrong omission?



##
hadoop-ozone/s3gateway/src/main/java/org/apache/hadoop/ozone/s3/endpoint/ObjectEndpoint.java:
##
@@ -1064,8 +1075,15 @@ private Response createMultipartKey(OzoneVolume volume, 
OzoneBucket ozoneBucket,
   putLength = IOUtils.copyLarge(multiDigestInputStream, 
ozoneOutputStream, 0, length,
   new byte[getIOBufferSize(length)]);
   byte[] digest = 
multiDigestInputStream.getMessageDigest(OzoneConsts.MD5_HASH).digest();
-  ozoneOutputStream.getMetadata()
-  .put(OzoneConsts.ETAG, 
DatatypeConverter.printHexBinary(digest).toLowerCase());
+  String eTag = DatatypeConverter.printHexBinary(digest).toLowerCase();
+  String clientContentMD5 = 
getHeaders().getHeaderString(S3Consts.CHECKSUM_HEADER);
+  if (clientContentMD5 != null) {
+CheckedRunnable checkContentMD5Hook = () -> {
+  S3Utils.validateContentMD5(clientContentMD5, eTag, key);
+};
+
ozoneOutputStream.getKeyOutputStream().setPreCommits(Collections.singletonList(checkContentMD5Hook));
+  }
+  ozoneOutputStream.getMetadata().put(OzoneConsts.ETAG, eTag);

Review Comment:
   Nit: Since now MD5 hash is used for both ETag and content-md5 verification, 
let's rename variable `eTag` to `md5Hash`. This also applied to other places.
   
   Let's also change `getMessageDigestDistance` and `ETAG_PROVIDER` to 
something like `getMD5DigestInstance` and `MD5_PROVIDER`.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


-
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]



Re: [PR] HDDS-10633. Support Content-MD5 header for checking object integrity when uploading object [ozone]

2026-01-13 Thread via GitHub


hevinhsu commented on PR #9612:
URL: https://github.com/apache/ozone/pull/9612#issuecomment-3747583971

   Thanks @jojochuang for pointing this out.
   
   I agree that the final ETag may not always be an MD5 (especially for 
multipart or encrypted objects). However, this patch focuses on in-transit 
integrity checking via the `Content-MD5` header.
   
   During the upload, S3G calculates the MD5 of the incoming data stream to 
verify it against the client-provided `Content-MD5`. This verification happens 
before the object is committed. The document you shared refers to the ETag's 
behavior after the commit, which is independent of the `Content-MD5` validation 
logic at the ingestion stage. Therefore, the ETag's final format does not 
affect this check.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


-
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]



Re: [PR] HDDS-10633. Support Content-MD5 header for checking object integrity when uploading object [ozone]

2026-01-13 Thread via GitHub


jojochuang commented on PR #9612:
URL: https://github.com/apache/ozone/pull/9612#issuecomment-3745338215

   Note: 
   https://docs.aws.amazon.com/AmazonS3/latest/API/API_Object.html
   
   > The ETag may or may not be an MD5 digest of the object data. 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


-
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]



Re: [PR] HDDS-10633. Support Content-MD5 header for checking object integrity when uploading object [ozone]

2025-12-13 Thread via GitHub


github-actions[bot] closed pull request #8506: HDDS-10633. Support Content-MD5 
header for checking object integrity when uploading object
URL: https://github.com/apache/ozone/pull/8506


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


-
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]



Re: [PR] HDDS-10633. Support Content-MD5 header for checking object integrity when uploading object [ozone]

2025-12-13 Thread via GitHub


github-actions[bot] commented on PR #8506:
URL: https://github.com/apache/ozone/pull/8506#issuecomment-3649953394

   Thank you for your contribution. This PR is being closed due to inactivity. 
If needed, feel free to reopen it.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


-
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]



Re: [PR] HDDS-10633. Support Content-MD5 header for checking object integrity when uploading object [ozone]

2025-12-06 Thread via GitHub


github-actions[bot] commented on PR #8506:
URL: https://github.com/apache/ozone/pull/8506#issuecomment-3621391647

   This PR has been marked as stale due to 21 days of inactivity. Please 
comment or remove the stale label to keep it open. Otherwise, it will be 
automatically closed in 7 days.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


-
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]



Re: [PR] HDDS-10633. Support Content-MD5 header for checking object integrity when uploading object [ozone]

2025-11-14 Thread via GitHub


Jimmyweng006 commented on PR #8506:
URL: https://github.com/apache/ozone/pull/8506#issuecomment-3535649481

   Hi @ivandika3 , Thanks for checking. I’m not actively on this right now. 
Please feel free to take over the patch.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


-
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]



Re: [PR] HDDS-10633. Support Content-MD5 header for checking object integrity when uploading object [ozone]

2025-11-13 Thread via GitHub


ivandika3 commented on PR #8506:
URL: https://github.com/apache/ozone/pull/8506#issuecomment-3530592306

   @Jimmyweng006 Can I check whether you are still actively working on this? If 
not, I (or other community members) can continue this patch.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


-
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]



Re: [PR] HDDS-10633. Support Content-MD5 header for checking object integrity when uploading object [ozone]

2025-11-11 Thread via GitHub


github-actions[bot] commented on PR #8506:
URL: https://github.com/apache/ozone/pull/8506#issuecomment-3519267672

   This PR has been marked as stale due to 21 days of inactivity. Please 
comment or remove the stale label to keep it open. Otherwise, it will be 
automatically closed in 7 days.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


-
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]



Re: [PR] HDDS-10633. Support Content-MD5 header for checking object integrity when uploading object [ozone]

2025-05-25 Thread via GitHub


Jimmyweng006 commented on PR #8506:
URL: https://github.com/apache/ozone/pull/8506#issuecomment-2907838994

   > So two main points:
   > 
   > 1. We can do a check on the ETag before the output stream is closed (i.e. 
key is committed)
   > 2. I think we can remove the deletion / abort logic for now. We can let 
the user decide what to do for multipart upload parts (abort it or reupload the 
parts). If we want to have some cleanup logic, we probably need to expose a new 
client protocol to clean up open keys, but this can be done in a separate 
ticket.
   > 
   > Also as suggested by @peterxcli , we can integration tests and even 
acceptance tests to test the behaviors.
   > 
   > @Jimmyweng006 @peterxcli Let me know your thoughts.
   
   Thanks for @ivandika3 pointing out the risk of abort committed key. I will 
change to below behaviors for both put/completeMultipartUpload
   
   1. make the isETagMismatch check before committing key
   2. remove cleanup when isETagMismatch check is true
   
   Regarding the integration tests(AbstractS3SDKV1Tests/AbstractS3SDKV2Tests) & 
acceptance tests(robot test) I will add it as well.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


-
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]



Re: [PR] HDDS-10633. Support Content-MD5 header for checking object integrity when uploading object [ozone]

2025-05-25 Thread via GitHub


ivandika3 commented on code in PR #8506:
URL: https://github.com/apache/ozone/pull/8506#discussion_r2106139934


##
hadoop-ozone/s3gateway/src/main/java/org/apache/hadoop/ozone/s3/endpoint/ObjectEndpoint.java:
##
@@ -896,15 +903,21 @@ public Response 
completeMultipartUpload(@PathParam("bucket") String bucket,
 LOG.debug("Parts map {}", partsMap);
   }
 
-  omMultipartUploadCompleteInfo = getClientProtocol()
+  OmMultipartUploadCompleteInfo omMultipartUploadCompleteInfo = 
getClientProtocol()
   .completeMultipartUpload(volume.getName(), bucket, key, uploadID,
   partsMap);
+  String serverEtag = omMultipartUploadCompleteInfo.getHash();
+  String encodedClientETag = headers.getHeaderString(CHECKSUM_HEADER);
+  if (S3Utils.isEtagMisMatch(encodedClientETag, serverEtag)) {
+abortMultipartUpload(volume, bucket, key, uploadID);

Review Comment:
   Is this multipart upload abort specified in the AWS S3 spec? If not, I'd 
prefer not to abort the incomplete multipart uploads. User can abort it 
themselves or we can let the background multipart upload cleanup service to 
abort it after a while (default 7 days).



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


-
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]



Re: [PR] HDDS-10633. Support Content-MD5 header for checking object integrity when uploading object [ozone]

2025-05-25 Thread via GitHub


ivandika3 commented on code in PR #8506:
URL: https://github.com/apache/ozone/pull/8506#discussion_r2106138338


##
hadoop-ozone/s3gateway/src/main/java/org/apache/hadoop/ozone/s3/util/S3Utils.java:
##
@@ -198,4 +200,11 @@ public static String generateCanonicalUserId(String input) 
{
 return DigestUtils.sha256Hex(input);
   }
 
+  public static boolean isEtagMisMatch(String encodedClientETag, String 
serverETag) {

Review Comment:
   Nit: rename to `isETagMismatch`



##
hadoop-ozone/s3gateway/src/main/java/org/apache/hadoop/ozone/s3/endpoint/ObjectEndpoint.java:
##
@@ -896,15 +903,21 @@ public Response 
completeMultipartUpload(@PathParam("bucket") String bucket,
 LOG.debug("Parts map {}", partsMap);
   }
 
-  omMultipartUploadCompleteInfo = getClientProtocol()
+  OmMultipartUploadCompleteInfo omMultipartUploadCompleteInfo = 
getClientProtocol()
   .completeMultipartUpload(volume.getName(), bucket, key, uploadID,
   partsMap);
+  String serverEtag = omMultipartUploadCompleteInfo.getHash();
+  String encodedClientETag = headers.getHeaderString(CHECKSUM_HEADER);
+  if (S3Utils.isEtagMisMatch(encodedClientETag, serverEtag)) {
+abortMultipartUpload(volume, bucket, key, uploadID);

Review Comment:
   Is this multipart upload abort specified in the AWS S3 spec? If not, I'd 
prefer not to abort the incomplete multipart uploads. User can abort it 
themselves or we can let the background multipart upload cleanup service to 
abort them after a while (default 7 days).



##
hadoop-ozone/s3gateway/src/main/java/org/apache/hadoop/ozone/s3/endpoint/ObjectEndpoint.java:
##
@@ -342,6 +343,13 @@ public Response put(
   output.getMetadata().put(ETAG, eTag);
 }
   }
+
+  String encodedClientETag = headers.getHeaderString(CHECKSUM_HEADER);
+  if (S3Utils.isEtagMisMatch(encodedClientETag, eTag)) {
+delete(bucketName, keyPath, uploadID, null);

Review Comment:
   I'm a bit worried about this since if there are concurrent key creation from 
the user, it might delete other user's key instead. I would prefer if the ETag 
checking is done before the key is committed (i.e. before `OzoneOutputStream` 
is closed). In the case of multipart uploads, user can decide to continue 
uploading the same part again, instead of aborting the multipart uploads (that 
might already contain a lot of large parts).
   
   Since we have an open key cleanup service, the single / multipart upload 
open keys will be deleted eventually. 



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


-
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]



Re: [PR] HDDS-10633. Support Content-MD5 header for checking object integrity when uploading object [ozone]

2025-05-25 Thread via GitHub


Jimmyweng006 commented on code in PR #8506:
URL: https://github.com/apache/ozone/pull/8506#discussion_r2106120620


##
hadoop-ozone/s3gateway/src/test/java/org/apache/hadoop/ozone/s3/endpoint/TestObjectPut.java:
##
@@ -758,4 +768,89 @@ public void testPutEmptyObject() throws IOException, 
OS3Exception {
 OzoneKeyDetails keyDetails = 
clientStub.getObjectStore().getS3Bucket(BUCKET_NAME).getKey(KEY_NAME);
 assertEquals(0, keyDetails.getDataSize());
   }
+
+  @Test
+  void testPutObjectWithEtagMismatchShouldCleanupAndThrow() throws 
IOException, OS3Exception, DecoderException {
+// Arrange
+String content = "test-content";
+ByteArrayInputStream body = new 
ByteArrayInputStream(content.getBytes(UTF_8));
+
bucket.setReplicationConfig(RatisReplicationConfig.getInstance(HddsProtos.ReplicationFactor.THREE));
+
+// Mock headers to return a mismatched checksum
+HttpHeaders etagHeaders = mock(HttpHeaders.class);
+String fakeMd5Hex = "deadbeefdeadbeefdeadbeefdeadbeef";
+byte[] fakeMd5Bytes = Hex.decodeHex(fakeMd5Hex);
+String encodedEtag = Base64.getEncoder().encodeToString(fakeMd5Bytes);
+when(etagHeaders.getHeaderString(CHECKSUM_HEADER)).thenReturn(encodedEtag);
+
when(etagHeaders.getHeaderString(X_AMZ_CONTENT_SHA256)).thenReturn(UNSIGNED_PAYLOAD);
+objectEndpoint.setHeaders(etagHeaders);
+
+// Spy on objectEndpoint.delete to verify cleanup is called
+ObjectEndpoint spyEndpoint = spy(objectEndpoint);
+doReturn(Response.ok().build()).when(spyEndpoint).delete(any(), any(), 
any(), any());
+
+// Act
+OS3Exception os3Exception = assertThrows(OS3Exception.class, () ->
+spyEndpoint.put(BUCKET_NAME, KEY_NAME, content.length(), 1,
+null, null, null, body));
+
+// Assert
+verify(spyEndpoint, times(1)).delete(any(), any(), any(), any());
+assertEquals(BAD_DIGEST.getCode(), os3Exception.getCode());
+assertEquals(HTTP_BAD_REQUEST, os3Exception.getHttpCode());
+  }
+
+  @Test
+  void testPutObjectWithEtagMatchShouldNotCleanupOrThrow() throws IOException, 
OS3Exception, DecoderException {
+// Arrange
+String content = "test-content-match";
+ByteArrayInputStream body = new 
ByteArrayInputStream(content.getBytes(UTF_8));
+
bucket.setReplicationConfig(RatisReplicationConfig.getInstance(HddsProtos.ReplicationFactor.THREE));
+
+// get server etag
+ObjectEndpoint spyEndpoint = spy(objectEndpoint);
+doReturn(Response.ok().build()).when(spyEndpoint).delete(any(), any(), 
any(), any());
+Response tempResp = spyEndpoint.put(BUCKET_NAME, KEY_NAME, 
content.length(), 1, null,

Review Comment:
   I think this one is ok as same content used in first & second put will have 
same ETag, and the second put just override the first put result.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


-
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]



Re: [PR] HDDS-10633. Support Content-MD5 header for checking object integrity when uploading object [ozone]

2025-05-25 Thread via GitHub


Jimmyweng006 commented on code in PR #8506:
URL: https://github.com/apache/ozone/pull/8506#discussion_r2106115047


##
hadoop-ozone/s3gateway/src/test/java/org/apache/hadoop/ozone/s3/endpoint/TestMultipartUploadComplete.java:
##
@@ -266,4 +270,96 @@ public void testMultipartInvalidPartError() throws 
Exception {
 () -> completeMultipartUpload(key, completeMultipartUploadRequest, 
uploadID));
 assertEquals(ex.getCode(), S3ErrorTable.INVALID_PART.getCode());
   }
+
+  @Test
+  public void testMultipartCompleteWithEtagMismatchShouldAbortAndThrow() 
throws Exception {
+// Arrange
+String key = UUID.randomUUID().toString();
+String uploadID = initiateMultipartUpload(key);
+CompleteMultipartUploadRequest completeMultipartUploadRequest =
+buildCompleteMultipartUploadRequest(key, uploadID);
+
+// Mock headers to return a mismatched etag
+HttpHeaders etagHeaders = mock(HttpHeaders.class);
+String fakeMd5Hex = "deadbeefdeadbeefdeadbeefdeadbeef";
+byte[] fakeMd5Bytes = Hex.decodeHex(fakeMd5Hex);
+String encodedEtag = Base64.getEncoder().encodeToString(fakeMd5Bytes);
+when(etagHeaders.getHeaderString(CHECKSUM_HEADER)).thenReturn(
+encodedEtag);
+rest.setHeaders(etagHeaders);
+
+// Act & Assert
+OS3Exception os3Exception = assertThrows(OS3Exception.class,
+() -> rest.completeMultipartUpload(OzoneConsts.S3_BUCKET, key, 
uploadID, completeMultipartUploadRequest));
+assertEquals(S3ErrorTable.BAD_DIGEST.getCode(), os3Exception.getCode());
+assertEquals(HTTP_BAD_REQUEST, os3Exception.getHttpCode());
+  }
+
+  @Test
+  public void testMultipartCompleteWithEtagMatchShouldSucceed() throws 
Exception {
+// Arrange
+String key = UUID.randomUUID().toString();
+String uploadID = initiateMultipartUpload(key);
+CompleteMultipartUploadRequest completeMultipartUploadRequest =
+buildCompleteMultipartUploadRequest(key, uploadID);
+
+// calculate correct etag
+Response tempResp =
+rest.completeMultipartUpload(OzoneConsts.S3_BUCKET, key, uploadID, 
completeMultipartUploadRequest);

Review Comment:
   Yes I think you are right, using the same uploadID in second 
**completeMultipartUpload** should cause NoSuchUpload error but somehow this 
error did not show in unit test... let me use new **initializeMultipartUpload** 
for second **completeMultipartUpload**.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


-
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]



Re: [PR] HDDS-10633. Support Content-MD5 header for checking object integrity when uploading object [ozone]

2025-05-24 Thread via GitHub


Jimmyweng006 commented on code in PR #8506:
URL: https://github.com/apache/ozone/pull/8506#discussion_r2105879827


##
hadoop-ozone/s3gateway/src/test/java/org/apache/hadoop/ozone/s3/endpoint/TestMultipartUploadComplete.java:
##
@@ -266,4 +270,96 @@ public void testMultipartInvalidPartError() throws 
Exception {
 () -> completeMultipartUpload(key, completeMultipartUploadRequest, 
uploadID));
 assertEquals(ex.getCode(), S3ErrorTable.INVALID_PART.getCode());
   }
+
+  @Test
+  public void testMultipartCompleteWithEtagMismatchShouldAbortAndThrow() 
throws Exception {
+// Arrange
+String key = UUID.randomUUID().toString();
+String uploadID = initiateMultipartUpload(key);
+CompleteMultipartUploadRequest completeMultipartUploadRequest =
+buildCompleteMultipartUploadRequest(key, uploadID);
+
+// Mock headers to return a mismatched etag
+HttpHeaders etagHeaders = mock(HttpHeaders.class);
+String fakeMd5Hex = "deadbeefdeadbeefdeadbeefdeadbeef";
+byte[] fakeMd5Bytes = Hex.decodeHex(fakeMd5Hex);
+String encodedEtag = Base64.getEncoder().encodeToString(fakeMd5Bytes);
+when(etagHeaders.getHeaderString(CHECKSUM_HEADER)).thenReturn(
+encodedEtag);
+rest.setHeaders(etagHeaders);
+
+// Act & Assert
+OS3Exception os3Exception = assertThrows(OS3Exception.class,
+() -> rest.completeMultipartUpload(OzoneConsts.S3_BUCKET, key, 
uploadID, completeMultipartUploadRequest));

Review Comment:
   Thanks for remind, I should align the behavior for both Put & 
MultipartComplete under EtagMismatchShouldCleanupAndThrow testing scenario.
   
   I also make MultipartComplete use delete instead of private function 
abortMultipartUpload for verify invocation times.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


-
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]



Re: [PR] HDDS-10633. Support Content-MD5 header for checking object integrity when uploading object [ozone]

2025-05-24 Thread via GitHub


Jimmyweng006 commented on code in PR #8506:
URL: https://github.com/apache/ozone/pull/8506#discussion_r2105879827


##
hadoop-ozone/s3gateway/src/test/java/org/apache/hadoop/ozone/s3/endpoint/TestMultipartUploadComplete.java:
##
@@ -266,4 +270,96 @@ public void testMultipartInvalidPartError() throws 
Exception {
 () -> completeMultipartUpload(key, completeMultipartUploadRequest, 
uploadID));
 assertEquals(ex.getCode(), S3ErrorTable.INVALID_PART.getCode());
   }
+
+  @Test
+  public void testMultipartCompleteWithEtagMismatchShouldAbortAndThrow() 
throws Exception {
+// Arrange
+String key = UUID.randomUUID().toString();
+String uploadID = initiateMultipartUpload(key);
+CompleteMultipartUploadRequest completeMultipartUploadRequest =
+buildCompleteMultipartUploadRequest(key, uploadID);
+
+// Mock headers to return a mismatched etag
+HttpHeaders etagHeaders = mock(HttpHeaders.class);
+String fakeMd5Hex = "deadbeefdeadbeefdeadbeefdeadbeef";
+byte[] fakeMd5Bytes = Hex.decodeHex(fakeMd5Hex);
+String encodedEtag = Base64.getEncoder().encodeToString(fakeMd5Bytes);
+when(etagHeaders.getHeaderString(CHECKSUM_HEADER)).thenReturn(
+encodedEtag);
+rest.setHeaders(etagHeaders);
+
+// Act & Assert
+OS3Exception os3Exception = assertThrows(OS3Exception.class,
+() -> rest.completeMultipartUpload(OzoneConsts.S3_BUCKET, key, 
uploadID, completeMultipartUploadRequest));

Review Comment:
   Thanks for remind, I should align the behavior for both Put & 
MultipartComplete under EtagMismatchShouldAbortAndThrow testing scenario.
   
   I also make MultipartComplete use delete instead of private function 
abortMultipartUpload for verify invocation times.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


-
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]



Re: [PR] HDDS-10633. Support Content-MD5 header for checking object integrity when uploading object [ozone]

2025-05-24 Thread via GitHub


peterxcli commented on code in PR #8506:
URL: https://github.com/apache/ozone/pull/8506#discussion_r2105860626


##
hadoop-ozone/s3gateway/src/test/java/org/apache/hadoop/ozone/s3/endpoint/TestMultipartUploadComplete.java:
##
@@ -266,4 +270,96 @@ public void testMultipartInvalidPartError() throws 
Exception {
 () -> completeMultipartUpload(key, completeMultipartUploadRequest, 
uploadID));
 assertEquals(ex.getCode(), S3ErrorTable.INVALID_PART.getCode());
   }
+
+  @Test
+  public void testMultipartCompleteWithEtagMismatchShouldAbortAndThrow() 
throws Exception {
+// Arrange
+String key = UUID.randomUUID().toString();
+String uploadID = initiateMultipartUpload(key);
+CompleteMultipartUploadRequest completeMultipartUploadRequest =
+buildCompleteMultipartUploadRequest(key, uploadID);
+
+// Mock headers to return a mismatched etag
+HttpHeaders etagHeaders = mock(HttpHeaders.class);
+String fakeMd5Hex = "deadbeefdeadbeefdeadbeefdeadbeef";
+byte[] fakeMd5Bytes = Hex.decodeHex(fakeMd5Hex);
+String encodedEtag = Base64.getEncoder().encodeToString(fakeMd5Bytes);
+when(etagHeaders.getHeaderString(CHECKSUM_HEADER)).thenReturn(
+encodedEtag);
+rest.setHeaders(etagHeaders);
+
+// Act & Assert
+OS3Exception os3Exception = assertThrows(OS3Exception.class,
+() -> rest.completeMultipartUpload(OzoneConsts.S3_BUCKET, key, 
uploadID, completeMultipartUploadRequest));
+assertEquals(S3ErrorTable.BAD_DIGEST.getCode(), os3Exception.getCode());
+assertEquals(HTTP_BAD_REQUEST, os3Exception.getHttpCode());
+  }
+
+  @Test
+  public void testMultipartCompleteWithEtagMatchShouldSucceed() throws 
Exception {
+// Arrange
+String key = UUID.randomUUID().toString();
+String uploadID = initiateMultipartUpload(key);
+CompleteMultipartUploadRequest completeMultipartUploadRequest =
+buildCompleteMultipartUploadRequest(key, uploadID);
+
+// calculate correct etag
+Response tempResp =
+rest.completeMultipartUpload(OzoneConsts.S3_BUCKET, key, uploadID, 
completeMultipartUploadRequest);

Review Comment:
   Is the duplicate(line 308, 321) completeMultipartUpload request ok? Wouldn't 
the second one be ignored or skipped?



##
hadoop-ozone/s3gateway/src/test/java/org/apache/hadoop/ozone/s3/endpoint/TestObjectPut.java:
##
@@ -758,4 +768,89 @@ public void testPutEmptyObject() throws IOException, 
OS3Exception {
 OzoneKeyDetails keyDetails = 
clientStub.getObjectStore().getS3Bucket(BUCKET_NAME).getKey(KEY_NAME);
 assertEquals(0, keyDetails.getDataSize());
   }
+
+  @Test
+  void testPutObjectWithEtagMismatchShouldCleanupAndThrow() throws 
IOException, OS3Exception, DecoderException {
+// Arrange
+String content = "test-content";
+ByteArrayInputStream body = new 
ByteArrayInputStream(content.getBytes(UTF_8));
+
bucket.setReplicationConfig(RatisReplicationConfig.getInstance(HddsProtos.ReplicationFactor.THREE));
+
+// Mock headers to return a mismatched checksum
+HttpHeaders etagHeaders = mock(HttpHeaders.class);
+String fakeMd5Hex = "deadbeefdeadbeefdeadbeefdeadbeef";
+byte[] fakeMd5Bytes = Hex.decodeHex(fakeMd5Hex);
+String encodedEtag = Base64.getEncoder().encodeToString(fakeMd5Bytes);
+when(etagHeaders.getHeaderString(CHECKSUM_HEADER)).thenReturn(encodedEtag);
+
when(etagHeaders.getHeaderString(X_AMZ_CONTENT_SHA256)).thenReturn(UNSIGNED_PAYLOAD);
+objectEndpoint.setHeaders(etagHeaders);
+
+// Spy on objectEndpoint.delete to verify cleanup is called
+ObjectEndpoint spyEndpoint = spy(objectEndpoint);
+doReturn(Response.ok().build()).when(spyEndpoint).delete(any(), any(), 
any(), any());
+
+// Act
+OS3Exception os3Exception = assertThrows(OS3Exception.class, () ->
+spyEndpoint.put(BUCKET_NAME, KEY_NAME, content.length(), 1,
+null, null, null, body));
+
+// Assert
+verify(spyEndpoint, times(1)).delete(any(), any(), any(), any());
+assertEquals(BAD_DIGEST.getCode(), os3Exception.getCode());
+assertEquals(HTTP_BAD_REQUEST, os3Exception.getHttpCode());
+  }
+
+  @Test
+  void testPutObjectWithEtagMatchShouldNotCleanupOrThrow() throws IOException, 
OS3Exception, DecoderException {
+// Arrange
+String content = "test-content-match";
+ByteArrayInputStream body = new 
ByteArrayInputStream(content.getBytes(UTF_8));
+
bucket.setReplicationConfig(RatisReplicationConfig.getInstance(HddsProtos.ReplicationFactor.THREE));
+
+// get server etag
+ObjectEndpoint spyEndpoint = spy(objectEndpoint);
+doReturn(Response.ok().build()).when(spyEndpoint).delete(any(), any(), 
any(), any());
+Response tempResp = spyEndpoint.put(BUCKET_NAME, KEY_NAME, 
content.length(), 1, null,

Review Comment:
   > Is the duplicate(line 308, 321) completeMultipartUpload request ok? 
Wouldn't the second one be ignored or skipped?
   
   Same question as I left in TestMultipartUploadComple