bvaradar commented on issue #1852: URL: https://github.com/apache/hudi/issues/1852#issuecomment-663427905
What do you mean by "runs serially with ingestion"? My understanding was that inline compaction happened in the same flow as writing so an inline compaction would simply slow down ingestion. ===> Yes, that is what I meant. Inline Compaction would run after ingestion but not in parallel. You can use #1752 to have it run concurrently. Does INLINE_COMPACT_NUM_DELTA_COMMITS_PROP refer to the number of commits retained in general, or the number of commits for a record? ==> INLINE_COMPACT_NUM_DELTA_COMMITS_PROP refers to number of ingestion (deltacommits) between 2 compaction runs. I see in the timeline I have several clean.requested and clean.inflight, how can I get these to actually complete? ==> If it is in inflight state alone, there could be errors when Hudi is trying to cleanup. Please look for exceptions in driver logs. Cleaner run should be run automatically by default. Also, any pending clean operations will automatically get picked up in next ingestion. So, it must have been failing for some reasons. You can turn on logs to see what is happening. Is it possible to force a compaction of the existing log files. ===> Yes, by configuring INLINE_COMPACT_NUM_DELTA_COMMITS_PROP. You can set it to 1 to have aggressive compaction. ---------------------------------------------------------------- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org