carloea2 commented on code in PR #4136:
URL: https://github.com/apache/texera/pull/4136#discussion_r2656969665


##########
common/workflow-core/src/main/scala/org/apache/texera/service/util/S3StorageClient.scala:
##########
@@ -259,4 +259,59 @@ object S3StorageClient {
       DeleteObjectRequest.builder().bucket(bucketName).key(objectKey).build()
     )
   }
+
+  /**
+    * Uploads a single part for an in-progress S3 multipart upload.
+    *
+    * This method wraps the AWS SDK v2 {@code UploadPart} API:
+    * it builds an {@link 
software.amazon.awssdk.services.s3.model.UploadPartRequest}
+    * and streams the part payload via a {@link 
software.amazon.awssdk.core.sync.RequestBody}.
+    *
+    * Payload handling:
+    *   - If {@code contentLength} is provided, the payload is streamed 
directly from {@code inputStream}
+    *     using {@code RequestBody.fromInputStream(inputStream, len)}.
+    *   - If {@code contentLength} is {@code None}, the entire {@code 
inputStream} is read into memory
+    *     ({@code readAllBytes}) and uploaded using {@code 
RequestBody.fromBytes(bytes)}.
+    *     This is convenient but can be memory-expensive for large parts; 
prefer providing a known length.
+    *
+    * Notes:
+    *   - {@code partNumber} must be in the valid S3 range (typically 
1..10,000).
+    *   - The caller is responsible for closing {@code inputStream}.
+    *   - This method is synchronous and will block the calling thread until 
the upload completes.
+    *
+    * @param bucket        S3 bucket name.
+    * @param key           Object key (path) being uploaded.
+    * @param uploadId      Multipart upload identifier returned by 
CreateMultipartUpload.
+    * @param partNumber    1-based part number for this upload.
+    * @param inputStream   Stream containing the bytes for this part.
+    * @param contentLength Optional size (in bytes) of this part; provide it 
to avoid buffering in memory.
+    * @return              The {@link 
software.amazon.awssdk.services.s3.model.UploadPartResponse},
+    *                      including the part ETag used for completing the 
multipart upload.
+    */
+  def uploadPart(
+      bucket: String,
+      key: String,
+      uploadId: String,
+      partNumber: Int,
+      inputStream: InputStream,
+      contentLength: Option[Long]
+  ): UploadPartResponse = {
+    val body: RequestBody = contentLength match {
+      case Some(len) => RequestBody.fromInputStream(inputStream, len)
+      case None =>

Review Comment:
   Same as above



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to