JunRuiLee opened a new pull request, #7328:
URL: https://github.com/apache/paimon/pull/7328

   
   ### Purpose
   
   This PR adds support for descriptor-based BLOB fields that copy raw data to 
a configured external target directory at write time. For these fields, Paimon 
writes the raw BLOB bytes to the target directory and stores only serialized 
`BlobDescriptor`s inline in data files. The change also adds validation for the 
new copied-data descriptor options and verifies that raw-data BLOB fields, 
descriptor-based BLOB fields, and descriptor-based BLOB fields with copied raw 
data can coexist in the same table.
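
The write path and the two validation rules named in the tests below can be sketched roughly as follows. This is a minimal, self-contained illustration and not Paimon's implementation: the class `CopiedBlobSketch`, its nested `Descriptor` type (a stand-in for `BlobDescriptor`), the `.blob` file suffix, and all method names are assumptions made for this example.

```java
import java.io.IOException;
import java.net.URI;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Arrays;
import java.util.Set;
import java.util.UUID;

public class CopiedBlobSketch {

    /** Hypothetical stand-in for a blob descriptor: where the bytes live and which range to read. */
    public static final class Descriptor {
        public final String uri;
        public final long offset;
        public final long length;

        public Descriptor(String uri, long offset, long length) {
            this.uri = uri;
            this.offset = offset;
            this.length = length;
        }
    }

    /** Rejects the two invalid configurations named by the validation tests. */
    public static void validate(
            Set<String> descriptorFields, Set<String> copiedFields, Path targetDir) {
        if (!copiedFields.isEmpty() && targetDir == null) {
            throw new IllegalArgumentException(
                    "Copied descriptor fields require a target directory");
        }
        if (!descriptorFields.containsAll(copiedFields)) {
            throw new IllegalArgumentException(
                    "Copied descriptor fields must be a subset of descriptor fields");
        }
    }

    /** Copies the raw bytes into the target directory and returns the descriptor to store inline. */
    public static Descriptor writeCopied(byte[] rawBytes, Path targetDir) throws IOException {
        Files.createDirectories(targetDir);
        // Hypothetical naming scheme: one file per blob value.
        Path file = targetDir.resolve(UUID.randomUUID() + ".blob");
        Files.write(file, rawBytes);
        return new Descriptor(file.toUri().toString(), 0L, rawBytes.length);
    }

    /** Resolves a descriptor back to the raw bytes it points at. */
    public static byte[] read(Descriptor d) throws IOException {
        byte[] all = Files.readAllBytes(Paths.get(URI.create(d.uri)));
        return Arrays.copyOfRange(all, (int) d.offset, (int) (d.offset + d.length));
    }
}
```

In this sketch the data file would hold only the small `Descriptor` value, while the payload lives under the target directory and is fetched on read.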
   
   This PR also refines `MERGE INTO` validation for BLOB columns in Flink and 
Spark. Updates are still rejected for raw-data BLOB columns, but are now 
allowed for descriptor-based BLOB columns, including those whose raw data is 
copied to an external target directory at write time.
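
The refined `MERGE INTO` rule amounts to a per-column check on the kind of BLOB storage. The sketch below illustrates that rule only; the `BlobKind` enum, the class name, and the error message are hypothetical and do not reflect the actual Flink or Spark code in this PR.

```java
import java.util.List;
import java.util.Map;

public class BlobMergeValidationSketch {

    /** Hypothetical classification of a column by how its BLOB data is stored. */
    public enum BlobKind {
        NOT_BLOB,
        RAW_DATA,
        DESCRIPTOR,
        DESCRIPTOR_COPIED
    }

    /** Only raw-data BLOB columns are rejected as update targets. */
    public static boolean isUpdatable(BlobKind kind) {
        return kind != BlobKind.RAW_DATA;
    }

    /** Throws if any updated column is a raw-data BLOB column. */
    public static void validateUpdate(Map<String, BlobKind> schema, List<String> updatedColumns) {
        for (String column : updatedColumns) {
            BlobKind kind = schema.get(column);
            if (kind == null) {
                throw new IllegalArgumentException("Unknown column: " + column);
            }
            if (!isUpdatable(kind)) {
                throw new UnsupportedOperationException(
                        "MERGE INTO cannot update raw-data BLOB column: " + column);
            }
        }
    }
}
```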
   
   
   ### Tests
   
   UT:
   - `BlobTableTest#testCopiedDescriptorBlobField`
   - `BlobTableTest#testThreeTypeBlobCoexistence`
   - `BlobTableTest#testCopiedDescriptorFieldValidationRequiresTargetDir`
   - `BlobTableTest#testCopiedDescriptorFieldMustBeSubsetOfDescriptorField`
   - `BlobTestBase`: `Blob: merge-into rejects updating raw-data BLOB column`
   - `BlobTestBase`: `Blob: merge-into updates non-blob column on descriptor blob table`
   - `BlobTestBase`: `Blob: merge-into updates descriptor blob column with copied data end-to-end`
   
   IT:
   - `BlobTableITCase#testCopiedDescriptorBlob`
   - `BlobTableITCase#testThreeTypeBlobCoexistence`
   - `BlobTableITCase#testCopiedDescriptorBlobMultipleWrites`
   - `DataEvolutionMergeIntoActionITCase#testUpdateRawBlobColumnThrowsError`
   - `DataEvolutionMergeIntoActionITCase#testUpdateNonBlobColumnOnDescriptorBlobTableSucceeds`
   - `DataEvolutionMergeIntoActionITCase#testUpdateCopiedDescriptorBlobColumnSucceeds`
   
   ### API and Format
   
   <!-- Does this change affect API or storage format -->
   
   ### Documentation
   
   <!-- Does this change introduce a new feature -->
   
   ### Generative AI tooling
   
   

