ehurheap commented on issue #9079: URL: https://github.com/apache/hudi/issues/9079#issuecomment-1640664456
Yes, the workaround using the writeClient that we discussed in [slack](https://apache-hudi.slack.com/archives/C4D716NPQ/p1689111633808279?thread_ts=1687983367.526889&cid=C4D716NPQ) worked for me. Here is a summary. First, we build the write client:

```scala
def buildWriteClient(): SparkRDDWriteClient[_] = {
  val lockProperties = new Properties()
  // populate lockProperties as appropriate
  val metricsProperties = new Properties()
  // populate metricsProperties as appropriate

  val writerConfig = HoodieWriteConfig
    .newBuilder()
    .withCompactionConfig(
      HoodieCompactionConfig
        .newBuilder()
        .withInlineCompaction(true)
        .withScheduleInlineCompaction(false)
        .withMaxNumDeltaCommitsBeforeCompaction(1)
        .build()
    )
    .withArchivalConfig(HoodieArchivalConfig.newBuilder().withAutoArchive(false).build())
    .withCleanConfig(HoodieCleanConfig.newBuilder().withAutoClean(false).build())
    .withMetadataConfig(HoodieMetadataConfig.newBuilder().enable(false).build())
    .withLockConfig(HoodieLockConfig.newBuilder().fromProperties(lockProperties).build())
    .withMetricsConfig(HoodieMetricsConfig.newBuilder().fromProperties(metricsProperties).build())
    .withDeleteParallelism(config.deleteParallelism)
    .withPath(config.tablePath)
    .forTable(datalakeRecord.tableName)
    .build()

  val engineContext: HoodieEngineContext = new HoodieSparkEngineContext(
    JavaSparkContext.fromSparkContext(sparkContext)
  )
  new SparkRDDWriteClient(engineContext, writerConfig)
}
```

Then we run the delete and compaction for the specified keys:

```scala
var deleteInstant: String = ""
try {
  deleteInstant = writeClient.startCommit()
  writeClient.delete(keysToDelete, deleteInstant)
  // :TRICKY: explicitly calling compaction here: although the write client was
  // configured to auto-compact inline, compaction is not in fact triggered by
  // this delete operation.
  val maybeCompactionInstant = writeClient.scheduleCompaction(org.apache.hudi.common.util.Option.empty())
  if (maybeCompactionInstant.isPresent)
    writeClient.compact(maybeCompactionInstant.get)
  else
    log.warn(s"Unable to schedule compaction after delete operation at instant ${deleteInstant}")
} catch {
  case t: Throwable =>
    logErrorAndExit(s"Delete operation failed for instant ${deleteInstant} due to ", t)
} finally {
  log.info(s"Finished delete operation for instant ${deleteInstant}")
  writeClient.close()
}
```
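For completeness: the snippets above assume `keysToDelete` already exists. `SparkRDDWriteClient.delete` takes a `JavaRDD[HoodieKey]`, where each `HoodieKey` pairs a record key with its partition path. A minimal sketch of building it might look like this (`recordKeys` and `partitionPath` are hypothetical names, not from the original code):

```scala
import scala.collection.JavaConverters._

import org.apache.hudi.common.model.HoodieKey
import org.apache.spark.api.java.{JavaRDD, JavaSparkContext}

// Hypothetical inputs: the record keys to delete and the partition they live in.
val recordKeys: Seq[String] = Seq("id-1", "id-2")
val partitionPath: String = "2023/07/01"

// Reuse the same JavaSparkContext the write client was built from.
val jsc = JavaSparkContext.fromSparkContext(sparkContext)

// One HoodieKey per record to delete, parallelized into a JavaRDD.
val keysToDelete: JavaRDD[HoodieKey] = jsc.parallelize(
  recordKeys.map(k => new HoodieKey(k, partitionPath)).asJava
)
```

If your table is non-partitioned, the partition path would be the empty string instead.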