+1 overall

On Sat, Mar 19, 2022 at 5:02 PM Surya Prasanna <[email protected]> wrote:
> Hi Sagar,
>
> Sorry for the delay in response. Thanks for the questions.
>
> 1. Trying to understand the main goal. Is it to balance the tradeoff
> between read and write amplification for the metadata table? Or is it
> purely to optimize for reads?
>
> On large tables, write amplification is a side effect of frequent
> compactions. So, instead of increasing the frequency of full compaction,
> we are proposing that a minor compaction (LogCompaction) be done
> frequently to merge only the log blocks and write a new log block. By
> merging the blocks, there are fewer blocks to deal with during a read;
> that way we optimize read performance and potentially avoid the write
> amplification problem.
>
> 2. Why do we need a separate action? Why can't any of the existing
> compaction strategies (or a new one if needed) help to achieve this?
>
> A new compaction strategy could be added, but we thought it might
> complicate the existing logic and require some hacks, especially since
> the Compaction action writes to a base file and places a .commit file
> upon completion. In our use case we are not concerned with the base file
> at all; instead, we are merging log blocks and writing them back to the
> log file. So we thought it is better to use a new action (called
> LogCompaction) that works at the log-file level and writes back to the
> log file. Since log files are in general added by a deltacommit,
> LogCompaction can place a .deltacommit upon completion.
>
> 3. Is the proposed LogCompaction a replacement for regular compaction
> for the metadata table, i.e. if LogCompaction is enabled then compaction
> cannot be done?
>
> LogCompaction is not a replacement for regular compaction. LogCompaction
> is performed as a minor compaction so as to reduce the number of log
> blocks to consider. It does not consider base files while merging the
> log blocks. To merge log files with the base file, the Compaction action
> is still needed.
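[Editor's note: the tradeoff described above can be illustrated with a minimal, hypothetical Python sketch. This is not Hudi code; `blocks_to_read` and the one-log-block-per-delta-commit assumption are illustrative only.]

```python
def blocks_to_read(num_delta_commits, log_compaction_every=None):
    """Count the log blocks a reader must open after a series of delta
    commits, optionally running LogCompaction every N commits.

    Simplifying assumption: each delta commit appends exactly one log
    block. LogCompaction merges all current log blocks into a single new
    log block; the base file is untouched (unlike a full Compaction,
    which would merge the log into a new base file version).
    """
    blocks = 0
    for commit in range(1, num_delta_commits + 1):
        blocks += 1  # delta commit appends a log block
        if log_compaction_every and commit % log_compaction_every == 0:
            blocks = 1  # all pending blocks merged into one new block
    return blocks

# Without LogCompaction, a reader opens one file handle per block:
print(blocks_to_read(100))                           # 100
# With LogCompaction every 10 delta commits:
print(blocks_to_read(100, log_compaction_every=10))  # 1
```

The point of the sketch: frequent LogCompaction bounds the number of blocks (and file handles) a reader must touch, without repeatedly rewriting the base file the way frequent full compaction would.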
> By using the LogCompaction action frequently, the frequency with which
> we do full-scale compaction is reduced.
>
> Consider a scenario in which, after 'X' LogCompaction actions, the log
> file size for some file groups becomes comparable to the base file
> size. In this scenario a LogCompaction action is going to take close to
> the same amount of time as a Compaction action, so full-scale
> Compaction should be performed on those file groups instead of
> LogCompaction. In the future we can also introduce logic to determine
> the right action (Compaction or LogCompaction) to perform depending on
> the state of the file group.
>
> Thanks,
> Surya
>
>
> On Fri, Mar 18, 2022 at 11:22 PM Surya Prasanna Yalla <[email protected]>
> wrote:
>
> > ---------- Forwarded message ---------
> > From: sagar sumit <[email protected]>
> > Date: Wed, Mar 16, 2022 at 11:17 PM
> > Subject: Re: [DISCUSS] New RFC to create LogCompaction action for MOR
> > tables?
> > To: <[email protected]>
> >
> > Hi Surya,
> >
> > This is a very interesting idea! I'll be looking forward to the RFC.
> >
> > I have a few high-level questions:
> >
> > 1. Trying to understand the main goal. Is it to balance the tradeoff
> > between read and write amplification for metadata table? Or is it
> > purely to optimize for reads?
> > 2. Why do we need a separate action? Why can't any of the existing
> > compaction strategies (or a new one if needed) help to achieve this?
> > 3. Is the proposed LogCompaction a replacement for regular compaction
> > for metadata table i.e. if LogCompaction is enabled then compaction
> > cannot be done?
> >
> > Regards,
> > Sagar
> >
> > On Thu, Mar 17, 2022 at 12:51 AM Surya Prasanna <
> > [email protected]>
> > wrote:
> >
> > > Hi Team,
> > >
> > > Record level index uses a metadata table, which is a MOR table type.
> > > Each delta commit in the metadata table creates multiple hfile log
> > > blocks, so multiple file handles have to be opened to read them,
> > > which might hurt read performance. To improve read performance,
> > > compaction can be run frequently, which merges all the log blocks
> > > into the base file and creates another version of the base file. If
> > > this is done frequently, it would cause write amplification.
> > >
> > > Instead of merging all the log blocks into the base file and doing
> > > a full compaction, a minor compaction can be done, which merges log
> > > blocks and creates one new log block.
> > >
> > > This can be achieved by adding a new action to Hudi called
> > > LogCompaction, and it requires an RFC. Please let me know what you
> > > think.
> > >
> > > Thanks,
> > >
> > > Surya
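[Editor's note: the file-group-level decision logic Surya alludes to ("determine the right action depending on the state of the file group") might look like the following hypothetical Python sketch. The function name, parameters, and threshold are illustrative assumptions, not part of Hudi or the proposal.]

```python
def choose_compaction_action(base_file_size, total_log_size,
                             size_ratio_threshold=1.0):
    """Pick a maintenance action for a single file group.

    Rationale from the thread: once a file group's log files grow
    comparable in size to its base file, LogCompaction takes roughly as
    long as a full Compaction, so the full Compaction (which also folds
    the log into a new base file) is the better choice. Otherwise, the
    cheaper LogCompaction suffices.
    """
    if base_file_size <= 0:
        # No base file yet: only a full Compaction can create one.
        return "COMPACTION"
    if total_log_size >= size_ratio_threshold * base_file_size:
        return "COMPACTION"
    return "LOG_COMPACTION"

print(choose_compaction_action(base_file_size=100, total_log_size=20))
print(choose_compaction_action(base_file_size=100, total_log_size=120))
```

The `size_ratio_threshold` knob is where a real implementation would tune how aggressively it falls back to full compaction.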
