Hi Zhangduo,

Thank you for your response. Your points are valid. From a generalizability perspective, it is indeed difficult to optimize write amplification during flushes on S3. Even so, for a more common architecture like HBase on HDFS, if the user base is large enough and the ROI is significant, the community might consider implementing HDFS-specific optimizations without applying them to S3. After all, solving this issue would not only avoid generating a large number of small files when MOB is enabled, but would also effectively reduce write amplification. If no such optimization has been made so far, does that mean the user base is still not large enough?
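For concreteness, the kind of compaction tuning I have in mind is along these lines (an illustrative hbase-site.xml sketch, not a recommendation; the values are placeholders, and the parameter names are as I understand them from the HBase reference guide):

```xml
<!-- Illustrative sketch only; values are placeholders. -->
<property>
  <!-- Off-peak window lets larger compactions run at night. -->
  <name>hbase.offpeak.start.hour</name>
  <value>0</value>
</property>
<property>
  <name>hbase.offpeak.end.hour</name>
  <value>6</value>
</property>
<property>
  <!-- Major compaction period in ms (default 604800000 = 7 days). -->
  <name>hbase.hregion.majorcompaction</name>
  <value>604800000</value>
</property>
<property>
  <!-- Minimum number of StoreFiles before a minor compaction is considered. -->
  <name>hbase.hstore.compaction.min</name>
  <value>3</value>
</property>
```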
If there is currently no optimization for write amplification during flushes with MOB, then for a cluster with several hundred TB of data, and without too many splits triggering major-compaction rewrites, would adjusting compaction parameters and reasonably scheduling peak and off-peak hours achieve a result similar to enabling MOB, while avoiding the hassle of cleaning up small files every week? It would also eliminate concerns about the instability of MOB. What would be your suggestions on this?

Best
----------------
Xinyu Tan

On 2025/06/19 01:35:03 "张铎(Duo Zhang)" wrote:
> I guess the problem is that only HDFS can support append, but HBase is
> designed to store HFiles on lots of other types of storage, like S3.
>
> And keeping the MOB file open for write is also only suitable for
> HDFS, as only HDFS supports hflush/hsync; for S3-like storage, a file
> is only visible once it is closed.
>
> Anyway, the MOB feature was not designed or implemented by me; these
> are just my thoughts on this area. I'm not sure whether anyone in the
> community uses it in production.
>
> Thanks.
>
> Xinyu Tan <tanxi...@apache.org> wrote on Thu, Jun 19, 2025 at 09:28:
> >
> > Hello everyone,
> >
> > I am a developer from the IoTDB and Ratis communities, and I am
> > familiar with distributed systems and storage engines. Recently, I
> > have been studying the MOB v2 feature in HBase.
> >
> > I found that when hbase.mob.compaction.type is set to optimized, a
> > single compaction can generate multiple files, each below a specific
> > size threshold. However, I also noticed that every memstore flush can
> > generate a new MOB HFile, and since the default flush threshold for
> > each memstore is 128 MB, many small MOB files are created. Given that
> > the default merge period for MOB files is one week, does this mean
> > that these newly generated small MOB files have to wait a week before
> > they can be merged into a larger file?
> > I am not sure whether my reading of the code is correct, so is this
> > reasoning accurate?
> >
> > If this is the case, I am curious why large files in the MOB region
> > aren't reused across different flushes and switched out only after
> > reaching a certain size. This approach doesn't seem to have any
> > downsides, and it could reduce write amplification. Single-node
> > storage engines like Badger/Titan operate this way; otherwise,
> > merging these small MOB HFiles still incurs write amplification. Was
> > there any specific consideration during the design that led to the
> > current approach?
> >
> > Additionally, I would like to understand the current state of the MOB
> > feature and whether it has reached a production-ready level.
> >
> > Thank you!
> >
> > Best
> > ------------------
> > Xinyu Tan
>
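To make the file-reuse idea discussed above concrete: the Badger/Titan-style approach amounts to a size-based rolling writer that appends to one open file across flushes and only switches files once a threshold is exceeded. The sketch below is purely illustrative Python, not HBase code; RollingMobWriter and all names in it are hypothetical.

```python
import os

class RollingMobWriter:
    """Illustrative sketch (not HBase code): reuse one output file across
    flushes and roll to a new file only once it would exceed a size
    threshold, as value-log based engines like Badger/Titan do."""

    def __init__(self, directory, max_bytes):
        self.directory = directory
        self.max_bytes = max_bytes
        self.index = 0
        self.current = None  # currently open file object, or None

    def _open_new(self):
        # Start a fresh file with a monotonically increasing suffix.
        self.index += 1
        path = os.path.join(self.directory, f"mobfile-{self.index:06d}")
        self.current = open(path, "ab")

    def append(self, payload: bytes) -> str:
        # Reuse the current file across flushes; switch only when the
        # write would push it past the threshold.
        if self.current is None or self.current.tell() + len(payload) > self.max_bytes:
            if self.current is not None:
                self.current.close()
            self._open_new()
        self.current.write(payload)
        return self.current.name
```

The point of the sketch is that consecutive small flushes land in the same file, so no later merge of tiny files is needed just to reach a healthy file size.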