[ https://issues.apache.org/jira/browse/HUDI-6120?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Ethan Guo updated HUDI-6120:
----------------------------
    Fix Version/s: 1.0.0

> fetchAllLogsMergedFileSlice will read basefile which it does not expect
> -----------------------------------------------------------------------
>
>                 Key: HUDI-6120
>                 URL: https://issues.apache.org/jira/browse/HUDI-6120
>             Project: Apache Hudi
>          Issue Type: Improvement
>            Reporter: Jianhui Dong
>            Priority: Major
>              Labels: pull-request-available
>             Fix For: 0.14.0, 1.0.0
>
>
> Check the code snippet of
> org.apache.hudi.common.table.view.AbstractTableFileSystemView#fetchAllLogsMergedFileSlice:
> {code:java}
> private Option<FileSlice> fetchAllLogsMergedFileSlice(HoodieFileGroup fileGroup, String maxInstantTime) {
>   List<FileSlice> fileSlices =
>       fileGroup.getAllFileSlicesBeforeOn(maxInstantTime).collect(Collectors.toList());
>   if (fileSlices.size() == 0) {
>     return Option.empty();
>   }
>   if (fileSlices.size() == 1) {
>     return Option.of(fileSlices.get(0));
>   }
>   final FileSlice latestSlice = fileSlices.get(0);
>   FileSlice merged = new FileSlice(latestSlice.getPartitionPath(), latestSlice.getBaseInstantTime(),
>       latestSlice.getFileId());
>   // add log files from the latest slice to the earliest
>   fileSlices.forEach(slice -> slice.getLogFiles().forEach(merged::addLogFile));
>   return Option.of(merged);
> }{code}
> If only one file slice is fetched, the method returns that slice unchanged,
> base file included, although the caller expects a slice that carries only
> log files. hudi-flink then creates a SkipMergeIterator/MergeIterator, which
> reads both the base file and the log files for the split.
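
The behaviour described in the issue can be illustrated with a minimal, self-contained sketch. It does not use Hudi's classes: SliceSketch and LogsOnlyMergeSketch are hypothetical stand-ins invented for this example, and the approach shown (dropping the single-slice shortcut so the result never carries a base file) is an assumption about one possible fix, not necessarily what the linked pull request does.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Optional;

// Hypothetical stand-in for a file slice; NOT Hudi's FileSlice class.
final class SliceSketch {
  final String fileId;
  final String baseInstantTime;
  final String baseFile;                          // null when the slice has no base file
  final List<String> logFiles = new ArrayList<>();

  SliceSketch(String fileId, String baseInstantTime, String baseFile) {
    this.fileId = fileId;
    this.baseInstantTime = baseInstantTime;
    this.baseFile = baseFile;
  }
}

final class LogsOnlyMergeSketch {

  // Slices are assumed to be ordered latest first, as in the quoted snippet.
  static Optional<SliceSketch> mergeLogsOnly(List<SliceSketch> slices) {
    if (slices.isEmpty()) {
      return Optional.empty();
    }
    SliceSketch latest = slices.get(0);
    // Always build a fresh slice without a base file, even when there is only
    // one input slice, so a reader of the result never sees a base file.
    SliceSketch merged = new SliceSketch(latest.fileId, latest.baseInstantTime, null);
    // Collect log files from the latest slice to the earliest.
    slices.forEach(slice -> merged.logFiles.addAll(slice.logFiles));
    return Optional.of(merged);
  }

  public static void main(String[] args) {
    SliceSketch only = new SliceSketch("f1", "001", "f1_001.parquet");
    only.logFiles.add(".f1_001.log.1");
    SliceSketch merged = mergeLogsOnly(Collections.singletonList(only)).get();
    System.out.println("base file: " + merged.baseFile + ", log files: " + merged.logFiles);
  }
}
```

With the single-slice shortcut from the quoted method, the same input would come back with its base file still attached, which is the case the issue reports.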