seawinde opened a new pull request, #65076:
URL: https://github.com/apache/doris/pull/65076

   ### What problem does this PR solve?
   
   Issue Number: N/A
   
   Related PR: N/A
   
   Problem Summary:
   
   The FE row-binlog schema and the BE row-binlog writer previously treated 
only visible source columns as normal row-binlog columns. As a result, hidden 
key columns were missing from row binlog. Hidden non-key internal columns 
(such as the sequence, delete, version, and skip-bitmap columns) should still 
be excluded from row binlog.
   
   Root cause: In `OlapTable.generateTableRowBinlogSchema()` and 
`SchemaChangeHandler.addColumnRowBinlog()`, FE generated and maintained 
row-binlog schema from visible columns only. In `RowBinlogSegmentWriter` / 
`RowBinlogSourceDataWriter`, BE also used visible-column counts to map source 
columns to row-binlog normal columns.
   
   This PR makes row-binlog normal columns follow a simple source-schema prefix 
contract: visible columns plus hidden key columns are written, and trailing 
hidden non-key columns are skipped. FE schema generation and add-column 
schema-change sync now use this contract. BE row-binlog writing uses the same 
normal-column count for full writes, partial update filtering, key column 
materialization, and BEFORE value columns.
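   
   The prefix contract described above can be illustrated with a minimal Java sketch. This is not the code in `OlapTable.generateTableRowBinlogSchema()`; the `Col` class, the `normalColumns` helper, and the column names are hypothetical stand-ins, and the exception type used for a misplaced column is an assumption.
   
   ```java
   import java.util.ArrayList;
   import java.util.List;

   public class RowBinlogPrefixSketch {
       // Hypothetical stand-in for a schema column; not the Doris Column class.
       public static final class Col {
           public final String name;
           public final boolean visible;
           public final boolean key;

           public Col(String name, boolean visible, boolean key) {
               this.name = name;
               this.visible = visible;
               this.key = key;
           }
       }

       // Keeps visible columns and hidden key columns, skips hidden non-key
       // columns, and rejects any visible/key column that appears after a
       // hidden non-key column (which would break the prefix contract).
       public static List<Col> normalColumns(List<Col> baseSchema) {
           List<Col> normal = new ArrayList<>();
           boolean seenHiddenNonKey = false;
           for (Col c : baseSchema) {
               boolean hiddenNonKey = !c.visible && !c.key;
               if (hiddenNonKey) {
                   seenHiddenNonKey = true;
                   continue;
               }
               if (seenHiddenNonKey) {
                   // Assumed error handling; the real check may differ.
                   throw new IllegalStateException(
                           "column " + c.name + " follows a hidden non-key column");
               }
               normal.add(c);
           }
           return normal;
       }

       public static void main(String[] args) {
           List<Col> schema = List.of(
                   new Col("k1", true, true),
                   new Col("hidden_key", false, true),   // hidden key: kept
                   new Col("v1", true, false),
                   new Col("delete_sign", false, false), // hidden non-key: skipped
                   new Col("version", false, false));    // hidden non-key: skipped
           List<String> names = new ArrayList<>();
           for (Col c : normalColumns(schema)) {
               names.add(c.name);
           }
           System.out.println(names); // prints [k1, hidden_key, v1]
       }
   }
   ```
   
   Under this sketch, the size of the returned list plays the role of the "normal column count" that the BE writer shares with the FE schema.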
   
   | File | Change Description |
   |------|-------------------|
   | `OlapTable.java` | Generate row-binlog schema from full base schema, include hidden key columns, skip hidden non-key columns, and reject visible/key columns after hidden non-key columns. |
   | `SchemaChangeHandler.java` | Keep hidden key columns when syncing ADD COLUMN changes to row-binlog schema, while skipping hidden non-key columns. |
   | `row_binlog_segment_writer.*` | Use row-binlog normal column count instead of visible column count, filter partial-update source cids to normal columns, and collect all key columns in the normal prefix. |
   | FE/BE/regression tests | Cover hidden key schema generation/writing and hidden non-key exclusion. |
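   
   As a rough illustration of "filter partial-update source cids to normal columns", the following Java sketch assumes that, under the prefix contract, the normal columns occupy source column ids `0` to `normalColumnCount - 1`. The BE code is C++ and is not reproduced here; the class and method names are hypothetical.
   
   ```java
   import java.util.ArrayList;
   import java.util.List;

   public class PartialUpdateCidFilterSketch {
       // Keeps only the source column ids that fall inside the normal prefix.
       // Assumption: normal columns are exactly the first normalColumnCount
       // columns of the source schema.
       public static List<Integer> filterToNormalColumns(
               List<Integer> sourceCids, int normalColumnCount) {
           List<Integer> kept = new ArrayList<>();
           for (int cid : sourceCids) {
               if (cid >= 0 && cid < normalColumnCount) {
                   kept.add(cid);
               }
           }
           return kept;
       }

       public static void main(String[] args) {
           // A partial update touching columns 0 and 2 plus two trailing
           // hidden non-key columns (ids 4 and 5) in a schema with 4 normal columns.
           List<Integer> cids = List.of(0, 2, 4, 5);
           System.out.println(filterToNormalColumns(cids, 4)); // prints [0, 2]
       }
   }
   ```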
   
   ```mermaid
   graph TD
     A[Source tablet schema] --> B[FE row-binlog schema]
     B -->|visible columns + hidden key columns| C[Row-binlog normal columns]
     B -->|skip trailing hidden non-key columns| D[Internal hidden columns]
     C --> E[BE RowBinlogSourceDataWriter]
     E --> F[RowBinlogSegmentWriter writes row-binlog segment]
   ```
   
   ### Release note
   
   Fixed an issue where row binlog did not include hidden key columns in the 
row-binlog schema and write path.
   
   ### Check List (For Author)
   
   - Test
       - [x] Regression test
           - `./run-regression-test.sh --run -d row_binlog_p0 -s test_row_binlog_hidden_column_schema -forceGenOut`
       - [x] Unit Test
           - `./run-fe-ut.sh --run org.apache.doris.catalog.OlapTableRowBinlogSchemaTest,org.apache.doris.alter.SchemaChangeHandlerTest`
           - `PATH=/home/seawinde/apache-maven-3.9.12/bin:$PATH ./run-be-ut.sh --run --filter=RowBinlogSourceDataWriterTest.* -j20`
       - [x] Manual test
           - `./build-support/check-format.sh`
           - `git diff --check`
       - [ ] No need to test or manual test. Explain why:
           - [ ] This is a refactor/code format and no logic has been changed.
           - [ ] Previous test can cover this change.
           - [ ] No code files have been changed.
           - [ ] Other reason
   
   - Behavior changed:
       - [ ] No.
       - [x] Yes. Row-binlog now includes hidden key columns, while hidden 
non-key internal columns remain excluded.
   
   - Does this need documentation?
       - [x] No.
       - [ ] Yes.
   
   ### Check List (For Reviewer who merge this PR)
   
   - [ ] Confirm the release note
   - [ ] Confirm test cases
   - [ ] Confirm document
   - [ ] Add branch pick label


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

