Copilot commented on code in PR #60795:
URL: https://github.com/apache/doris/pull/60795#discussion_r2837862016
##########
be/src/vec/exec/format/table/paimon_cpp_reader.cpp:
##########
@@ -311,6 +311,22 @@ std::map<std::string, std::string>
PaimonCppReader::_build_options() const {
copy_if_missing("fs.s3a.region", "AWS_REGION");
copy_if_missing("fs.s3a.path.style.access", "use_path_style");
+ // FE currently may not pass paimon table options in scan ranges.
+ // Avoid paimon-cpp defaulting manifest.format to avro when split file
format is known.
+ if (_range.__isset.table_format_params &&
_range.table_format_params.__isset.paimon_params &&
+ _range.table_format_params.paimon_params.__isset.file_format &&
+ !_range.table_format_params.paimon_params.file_format.empty()) {
+ const auto& split_file_format =
_range.table_format_params.paimon_params.file_format;
+ auto file_format_it = options.find(paimon::Options::FILE_FORMAT);
+ if (file_format_it == options.end() || file_format_it->second.empty())
{
+ options[paimon::Options::FILE_FORMAT] = split_file_format;
+ }
+ auto manifest_format_it =
options.find(paimon::Options::MANIFEST_FORMAT);
+ if (manifest_format_it == options.end() ||
manifest_format_it->second.empty()) {
+ options[paimon::Options::MANIFEST_FORMAT] = split_file_format;
+ }
+ }
Review Comment:
New option inference logic isn’t covered by existing PaimonCppReader unit
tests. Consider adding a test that asserts when split-level `file_format` is
set and `options` lacks/has empty `paimon::Options::FILE_FORMAT` /
`MANIFEST_FORMAT`, `_build_options()` (or an observable init path) populates
them, and does not override non-empty values.
##########
be/src/vec/exec/format/table/paimon_cpp_reader.cpp:
##########
@@ -311,6 +311,22 @@ std::map<std::string, std::string>
PaimonCppReader::_build_options() const {
copy_if_missing("fs.s3a.region", "AWS_REGION");
copy_if_missing("fs.s3a.path.style.access", "use_path_style");
+ // FE currently may not pass paimon table options in scan ranges.
+ // Avoid paimon-cpp defaulting manifest.format to avro when split file
format is known.
+ if (_range.__isset.table_format_params &&
_range.table_format_params.__isset.paimon_params &&
+ _range.table_format_params.paimon_params.__isset.file_format &&
+ !_range.table_format_params.paimon_params.file_format.empty()) {
+ const auto& split_file_format =
_range.table_format_params.paimon_params.file_format;
+ auto file_format_it = options.find(paimon::Options::FILE_FORMAT);
+ if (file_format_it == options.end() || file_format_it->second.empty())
{
+ options[paimon::Options::FILE_FORMAT] = split_file_format;
+ }
+ auto manifest_format_it =
options.find(paimon::Options::MANIFEST_FORMAT);
+ if (manifest_format_it == options.end() ||
manifest_format_it->second.empty()) {
+ options[paimon::Options::MANIFEST_FORMAT] = split_file_format;
+ }
+ }
Review Comment:
This behavior is slightly broader than the PR description (“only when table
options are missing/empty”): it will also set `FILE_FORMAT`/`MANIFEST_FORMAT`
when other table options are present but just these keys are absent/empty. If
that’s intended, the PR description should be updated; if not, consider
tightening the condition to only apply when the table-level paimon options map
is missing/empty.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]