hudi-bot opened a new issue, #16636:
URL: https://github.com/apache/hudi/issues/16636
Currently, we only try to use the merger when Parquet log blocks are configured. The
change will probably look something like:
{code:java}
public HoodieRecordMerger getRecordMerger() {
  List<String> mergers = StringUtils.split(getString(RECORD_MERGER_IMPLS), ",").stream()
      .map(String::trim)
      .distinct()
      .collect(Collectors.toList());
  return getRecordMerger(getString(BASE_PATH), getRecordMergeMode(), engineType,
      getLogDataBlockFormat(), mergers, getStringOpt(RECORD_MERGER_STRATEGY), getTableType());
}

public static HoodieRecordMerger getRecordMerger(String basePath,
                                                 RecordMergeMode mergeMode,
                                                 EngineType engineType,
                                                 HoodieLogBlock.HoodieLogBlockType logBlockType,
                                                 List<String> mergers,
                                                 Option<String> strategy,
                                                 HoodieTableType tableType) {
  if (tableType == HoodieTableType.COPY_ON_WRITE) {
    return getRecordMergerBasedOnMergeMode(basePath, mergeMode, engineType, mergers, strategy);
  }
  switch (logBlockType) {
    case AVRO_DATA_BLOCK:
    case HFILE_DATA_BLOCK:
      return HoodieAvroRecordMerger.INSTANCE;
    case PARQUET_DATA_BLOCK:
      return getRecordMergerBasedOnMergeMode(basePath, mergeMode, engineType, mergers, strategy);
    default:
      throw new IllegalStateException("This log block type is not implemented");
  }
}

private static HoodieRecordMerger getRecordMergerBasedOnMergeMode(String basePath,
                                                                  RecordMergeMode mergeMode,
                                                                  EngineType engineType,
                                                                  List<String> mergers,
                                                                  Option<String> strategy) {
  //TODO: [HUDI-8202] make this custom mergers only
  switch (mergeMode) {
    case EVENT_TIME_ORDERING:
      switch (engineType) {
        case SPARK:
          return HoodieRecordUtils.loadRecordMerger("org.apache.hudi.DefaultSparkRecordMerger");
        default:
          return HoodieRecordUtils.createRecordMerger(basePath, engineType, mergers, strategy);
      }
    case OVERWRITE_WITH_LATEST:
      switch (engineType) {
        case SPARK:
          return HoodieRecordUtils.loadRecordMerger("org.apache.hudi.OverwriteWithLatestSparkRecordMerger");
        default:
          return HoodieRecordUtils.createRecordMerger(basePath, engineType, mergers, strategy);
      }
    case CUSTOM:
    default:
      return HoodieRecordUtils.createRecordMerger(basePath, engineType, mergers, strategy);
  }
}
{code}
But the bulk of the work will be addressing any issues that arise from the
change.
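
To make the effect of the table-type check easier to see, here is a minimal, self-contained sketch of the same dispatch. It does not use Hudi classes: the enums `TableType`, `LogBlockType`, and `MergeMode` and the string labels returned are hypothetical stand-ins for `HoodieTableType`, `HoodieLogBlockType`, `RecordMergeMode`, and the actual merger instances, and the engine-type branch is left out for brevity.

```java
// Illustrative only: simplified stand-ins, not the real Hudi types or mergers.
class MergerSelectionSketch {
    enum TableType { COPY_ON_WRITE, MERGE_ON_READ }
    enum LogBlockType { AVRO_DATA_BLOCK, HFILE_DATA_BLOCK, PARQUET_DATA_BLOCK }
    enum MergeMode { EVENT_TIME_ORDERING, OVERWRITE_WITH_LATEST, CUSTOM }

    // Returns a label naming the merger family the proposed logic would choose.
    static String selectMerger(TableType tableType, LogBlockType logBlockType, MergeMode mergeMode) {
        // Proposed new branch: copy-on-write tables skip the log block check.
        if (tableType == TableType.COPY_ON_WRITE) {
            return byMergeMode(mergeMode);
        }
        switch (logBlockType) {
            case AVRO_DATA_BLOCK:
            case HFILE_DATA_BLOCK:
                return "avro";
            case PARQUET_DATA_BLOCK:
                return byMergeMode(mergeMode);
            default:
                throw new IllegalStateException("This log block type is not implemented");
        }
    }

    // Mirrors the merge-mode switch, minus the engine-type distinction.
    static String byMergeMode(MergeMode mergeMode) {
        switch (mergeMode) {
            case EVENT_TIME_ORDERING:
                return "event-time";
            case OVERWRITE_WITH_LATEST:
                return "overwrite-with-latest";
            case CUSTOM:
            default:
                return "custom";
        }
    }

    public static void main(String[] args) {
        // Same log block type and merge mode, different table type.
        System.out.println(selectMerger(TableType.COPY_ON_WRITE,
            LogBlockType.AVRO_DATA_BLOCK, MergeMode.EVENT_TIME_ORDERING)); // prints "event-time"
        System.out.println(selectMerger(TableType.MERGE_ON_READ,
            LogBlockType.AVRO_DATA_BLOCK, MergeMode.EVENT_TIME_ORDERING)); // prints "avro"
    }
}
```

In this sketch, a copy-on-write table with Avro log blocks configured resolves its merger from the merge mode, while a merge-on-read table with the same settings still falls back to the Avro merger. That difference is the behavior change whose fallout would need to be addressed.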
## JIRA info
- Link: https://issues.apache.org/jira/browse/HUDI-8259
- Type: Improvement
- Epic: https://issues.apache.org/jira/browse/HUDI-6243
- Fix version(s): 1.1.0