[ https://issues.apache.org/jira/browse/HUDI-2355?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17404991#comment-17404991 ]
Yue Zhang edited comment on HUDI-2355 at 8/26/21, 6:41 AM:
-----------------------------------------------------------

Actually, this problem does exist on the current master branch: nothing ensures that the cleaner runs first and archival executes afterwards.

{code:java}
protected void postCommit(HoodieTable<T, I, K, O> table, HoodieCommitMetadata metadata, String instantTime, Option<Map<String, String>> extraMetadata) {
  try {
    // Delete the marker directory for the instant.
    WriteMarkersFactory.get(config.getMarkersType(), table, instantTime)
        .quietDeleteMarkerDir(context, config.getMarkersDeleteParallelism());
    // We cannot have unbounded commit files. Archive commits if we have to archive
    HoodieTimelineArchiveLog archiveLog = new HoodieTimelineArchiveLog(config, table);
    archiveLog.archiveIfRequired(context);
    if (operationType != null && operationType != WriteOperationType.CLUSTER && operationType != WriteOperationType.COMPACT) {
      syncTableMetadata();
    }
  } catch (IOException ioe) {
    throw new HoodieIOException(ioe.getMessage(), ioe);
  } finally {
    this.heartbeatClient.stop(instantTime);
  }
}
{code}

Even in async cleaner mode, archival does not wait for the async cleaner service to finish before it starts to archive/delete commits.

I just raised a PR to fix this problem by adjusting the execution order so that the cleaner executes first and archival runs afterwards.

> after clustering with archive meet data incorrect
> --------------------------------------------------
>
> Key: HUDI-2355
> URL: https://issues.apache.org/jira/browse/HUDI-2355
> Project: Apache Hudi
> Issue Type: Sub-task
> Reporter: liwei
> Assignee: liwei
> Priority: Major
> Labels: pull-request-available
>
> After [https://github.com/apache/hudi/pull/3310], replaced data files are cleaned by the clean action. But if the replacecommit file has been deleted, clean cannot read the data files.
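
As a rough illustration of the ordering discussed in the comment above (clean first, archive afterwards, and in async mode wait for the cleaner before archiving), here is a minimal, self-contained sketch. It is not the code from the PR and uses no Hudi classes: `PostCommitOrderingSketch`, `deleteMarkers`, `clean`, `archive` and `stopHeartbeat` are hypothetical stand-ins that only record the order in which the steps run.

```java
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CopyOnWriteArrayList;

/** Hypothetical stand-in for a write client; none of these names are Hudi APIs. */
public class PostCommitOrderingSketch {
    // Thread-safe log of the steps, in the order they actually ran.
    private final List<String> steps = new CopyOnWriteArrayList<>();

    private void deleteMarkers() { steps.add("deleteMarkers"); }
    private void clean()         { steps.add("clean"); }
    private void archive()       { steps.add("archive"); }
    private void stopHeartbeat() { steps.add("stopHeartbeat"); }

    /**
     * Runs the post-commit steps so that archival never starts before cleaning is done.
     * With asyncClean = true the cleaner runs on another thread and is joined before archiving.
     */
    public List<String> postCommit(boolean asyncClean) {
        // Assumption: in async mode the cleaner has already been kicked off in the background.
        CompletableFuture<Void> cleaner =
            asyncClean ? CompletableFuture.runAsync(this::clean) : null;
        try {
            deleteMarkers();
            if (cleaner != null) {
                cleaner.join();   // block until the async cleaner has finished
            } else {
                clean();          // inline cleaning, still ahead of archival
            }
            archive();            // archive only after cleaning is complete
        } finally {
            stopHeartbeat();
        }
        return List.copyOf(steps);
    }

    public static void main(String[] args) {
        System.out.println(new PostCommitOrderingSketch().postCommit(false));
        System.out.println(new PostCommitOrderingSketch().postCommit(true));
    }
}
```

In the async case, "clean" may be logged before or after "deleteMarkers", but the `join()` call guarantees it is logged before "archive", which is the property the comment is asking for.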