suxiaogang223 opened a new pull request, #61759:
URL: https://github.com/apache/doris/pull/61759

   ### What problem does this PR solve?
   
   When reading Iceberg Parquet position delete files, Doris currently treats 
the `file_path` column as dictionary-encoded whenever the column chunk has a 
dictionary page. That check is too loose: Parquet allows mixed encodings within 
a single column chunk, so a chunk can contain both dictionary-encoded and 
plain-encoded data pages.
   
   When that happens, Doris builds a `ColumnDictI32` for `file_path`, but the 
plain decoder later calls `insert_many_strings()`, which fails with:
   
   `Method insert_many_strings is not supported for ColumnDictionary`
   
   This PR fixes the issue by using dictionary-backed decoding for Iceberg 
position delete `file_path` columns only when every data page in the Parquet 
column chunk is dictionary-encoded. Mixed-encoding chunks now fall back to 
normal string columns.
   
   It also adds BE unit coverage for:
   - fully dictionary-encoded parquet metadata
   - mixed dictionary/plain parquet metadata
   - parquet metadata without `encoding_stats` but with non-dictionary encodings
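
For readers unfamiliar with the Parquet metadata involved, the check described above can be sketched roughly as follows. This is an illustrative sketch, not the code in this PR: the `Encoding`, `PageType`, `PageEncodingStats`, and `ColumnMetaData` types are simplified stand-ins for the Parquet Thrift structures, `has_dictionary_page` is an assumed flag standing in for however the reader detects a dictionary page, and `is_fully_dictionary_encoded` is a hypothetical helper name. The fallback for metadata without `encoding_stats` is deliberately conservative and may differ from what the PR does.

```cpp
#include <cstdint>
#include <optional>
#include <vector>

// Simplified stand-ins for the Parquet Thrift types (assumption: these are
// NOT Doris's actual classes, only the fields relevant to the check).
enum class Encoding { PLAIN, PLAIN_DICTIONARY, RLE, BIT_PACKED, RLE_DICTIONARY };
enum class PageType { DATA_PAGE, DICTIONARY_PAGE, DATA_PAGE_V2 };

struct PageEncodingStats {
    PageType page_type;
    Encoding encoding;
    int32_t count;  // number of pages with this (page_type, encoding) pair
};

struct ColumnMetaData {
    std::vector<Encoding> encodings;                                // always present
    std::optional<std::vector<PageEncodingStats>> encoding_stats;   // optional in Parquet
    bool has_dictionary_page = false;                               // assumed flag
};

static bool is_dict_encoding(Encoding e) {
    return e == Encoding::PLAIN_DICTIONARY || e == Encoding::RLE_DICTIONARY;
}

// Hypothetical helper: returns true only if every data page in the chunk
// is dictionary-encoded, so a dictionary-backed column is safe to build.
bool is_fully_dictionary_encoded(const ColumnMetaData& meta) {
    if (!meta.has_dictionary_page) {
        return false;
    }
    if (meta.encoding_stats.has_value()) {
        // Precise path: inspect per-page-type statistics.
        bool saw_data_page = false;
        for (const auto& s : *meta.encoding_stats) {
            bool is_data_page = s.page_type == PageType::DATA_PAGE ||
                                s.page_type == PageType::DATA_PAGE_V2;
            if (!is_data_page || s.count <= 0) {
                continue;  // skip dictionary pages and empty entries
            }
            if (!is_dict_encoding(s.encoding)) {
                return false;  // at least one plain (or other) data page
            }
            saw_data_page = true;
        }
        return saw_data_page;
    }
    // Fallback without encoding_stats: RLE and BIT_PACKED are tolerated
    // because they encode repetition/definition levels. Any other
    // non-dictionary encoding makes the chunk ineligible. This is
    // conservative: it may reject chunks that are in fact fully
    // dictionary-encoded, which only costs the optimization.
    bool saw_dict = false;
    for (Encoding e : meta.encodings) {
        if (is_dict_encoding(e)) {
            saw_dict = true;
        } else if (e != Encoding::RLE && e != Encoding::BIT_PACKED) {
            return false;
        }
    }
    return saw_dict;
}
```

Under these assumptions, the three cases listed above map to: all data-page stats dictionary-encoded (eligible), a `PLAIN` data-page entry in `encoding_stats` (not eligible), and no `encoding_stats` with `PLAIN` in `encodings` (not eligible). Erring toward "not eligible" is the safe direction, since the fallback path is an ordinary string column.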
   
   ### Release note
   
   None
   
   ### Check List
   
   - [x] This issue was confirmed with code analysis and user logs
   - [x] This change includes unit test coverage
   - [ ] Local unit tests were run in this environment
   
   ### Testing
   
   Local `git diff --check` passed.
   BE unit test execution was not run locally because the current build 
directory on this machine does not include the `doris_be_test` target.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

