ehurheap commented on issue #9079:
URL: https://github.com/apache/hudi/issues/9079#issuecomment-1640664456

   Yes, the workaround using the writeClient that we discussed in [Slack](https://apache-hudi.slack.com/archives/C4D716NPQ/p1689111633808279?thread_ts=1687983367.526889&cid=C4D716NPQ) worked for me.
   
   Here is a summary.
   
   First, we build the writeClient:
   
   ```scala
   import java.util.Properties
   
   import org.apache.hudi.client.SparkRDDWriteClient
   import org.apache.hudi.client.common.HoodieSparkEngineContext
   import org.apache.hudi.common.config.HoodieMetadataConfig
   import org.apache.hudi.common.engine.HoodieEngineContext
   import org.apache.hudi.config.{HoodieArchivalConfig, HoodieCleanConfig, HoodieCompactionConfig, HoodieLockConfig, HoodieWriteConfig}
   import org.apache.hudi.config.metrics.HoodieMetricsConfig
   import org.apache.spark.api.java.JavaSparkContext
   
   def buildWriteClient(): SparkRDDWriteClient[_] = {
     val lockProperties = new Properties()    // populate lock provider settings as appropriate
     val metricsProperties = new Properties() // populate metrics settings as appropriate
   
     val writerConfig = HoodieWriteConfig
       .newBuilder()
       .withCompactionConfig(
         HoodieCompactionConfig
           .newBuilder()
           .withInlineCompaction(true)          // compact inline with writes...
           .withScheduleInlineCompaction(false) // ...but do not auto-schedule the compaction plan
           .withMaxNumDeltaCommitsBeforeCompaction(1)
           .build()
       )
       .withArchivalConfig(HoodieArchivalConfig.newBuilder().withAutoArchive(false).build())
       .withCleanConfig(HoodieCleanConfig.newBuilder().withAutoClean(false).build())
       .withMetadataConfig(HoodieMetadataConfig.newBuilder().enable(false).build())
       .withLockConfig(HoodieLockConfig.newBuilder().fromProperties(lockProperties).build())
       .withMetricsConfig(HoodieMetricsConfig.newBuilder().fromProperties(metricsProperties).build())
       .withDeleteParallelism(config.deleteParallelism) // config and datalakeRecord are application-specific
       .withPath(config.tablePath)
       .forTable(datalakeRecord.tableName)
       .build()
   
     val engineContext: HoodieEngineContext = new HoodieSparkEngineContext(
       JavaSparkContext.fromSparkContext(sparkContext)
     )
     new SparkRDDWriteClient(engineContext, writerConfig)
   }
   ```
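   
   For reference, here is a minimal sketch of how the two `Properties` objects might be populated, assuming a ZooKeeper-based lock provider and the Graphite metrics reporter. The property keys are standard Hudi configs, but the host, port, and path values are placeholders to substitute with your own:
   
   ```scala
   // Hypothetical values: replace with your own ZooKeeper quorum and metrics endpoint.
   lockProperties.setProperty("hoodie.write.lock.provider",
     "org.apache.hudi.client.transaction.lock.ZookeeperBasedLockProvider")
   lockProperties.setProperty("hoodie.write.lock.zookeeper.url", "zk-host")
   lockProperties.setProperty("hoodie.write.lock.zookeeper.port", "2181")
   lockProperties.setProperty("hoodie.write.lock.zookeeper.lock_key", "my_table")
   lockProperties.setProperty("hoodie.write.lock.zookeeper.base_path", "/hudi/locks")
   
   metricsProperties.setProperty("hoodie.metrics.on", "true")
   metricsProperties.setProperty("hoodie.metrics.reporter.type", "GRAPHITE")
   metricsProperties.setProperty("hoodie.metrics.graphite.host", "graphite-host")
   metricsProperties.setProperty("hoodie.metrics.graphite.port", "4756")
   ```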
   Then run delete and compaction for the specified keys:
   
   ```scala
   var deleteInstant: String = ""
   try {
     deleteInstant = writeClient.startCommit()
     writeClient.delete(keysToDelete, deleteInstant)
     // :TRICKY: explicitly calling compaction here: although the write client was
     // configured to auto compact inline, compaction is not in fact triggered by
     // this delete operation.
     val maybeCompactionInstant =
       writeClient.scheduleCompaction(org.apache.hudi.common.util.Option.empty())
     if (maybeCompactionInstant.isPresent)
       writeClient.compact(maybeCompactionInstant.get)
     else
       log.warn(
         s"Unable to schedule compaction after delete operation at instant ${deleteInstant}"
       )
   } catch {
     case t: Throwable =>
       logErrorAndExit(s"Delete operation failed for instant ${deleteInstant}", t)
   } finally {
     log.info(s"Finished delete operation for instant ${deleteInstant}")
     writeClient.close()
   }
   ```
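   
   For completeness, `keysToDelete` above is a `JavaRDD[HoodieKey]`. A minimal sketch of building it, with hypothetical record key / partition path pairs:
   
   ```scala
   import org.apache.hudi.common.model.HoodieKey
   import org.apache.spark.api.java.JavaRDD
   
   // Hypothetical (recordKey, partitionPath) pairs identifying the records to delete.
   val keysToDelete: JavaRDD[HoodieKey] = JavaSparkContext
     .fromSparkContext(sparkContext)
     .parallelize(java.util.Arrays.asList(
       new HoodieKey("record-id-1", "partition=2023-07-01"),
       new HoodieKey("record-id-2", "partition=2023-07-02")
     ))
   ```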
   

