[ https://issues.apache.org/jira/browse/HBASE-24749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17448029#comment-17448029 ]

Wellington Chevreuil commented on HBASE-24749:
----------------------------------------------

We favoured the design approach from HBASE-26067 and have been doing all the necessary refactorings as subtasks to allow for a pluggable way of creating and tracking storefiles, so I think this Jira (and all its subtasks) is no longer relevant. Can we close it as duplicate or abandoned, to avoid any confusion about what is still being worked on? [~zhangduo] [~elserj] [~zyork] [~stack]

> Direct insert HFiles and Persist in-memory HFile tracking
> ---------------------------------------------------------
>
>                 Key: HBASE-24749
>                 URL: https://issues.apache.org/jira/browse/HBASE-24749
>             Project: HBase
>          Issue Type: Umbrella
>          Components: Compaction, HFile
>    Affects Versions: 3.0.0-alpha-1
>            Reporter: Tak-Lon (Stephen) Wu
>            Assignee: Tak-Lon (Stephen) Wu
>            Priority: Major
>              Labels: design, discussion, objectstore, storeFile, storeengine
>         Attachments: 1B100m-25m25m-performance.pdf, Apache HBase - Direct insert HFiles and Persist in-memory HFile tracking.pdf
>
>
> We propose a new feature (a new store engine) that removes the {{.tmp}}
> directory used in the commit stage of common HFile operations such as flush
> and compaction, to improve write throughput and latency on object stores.
> Specifically for S3 filesystems, this will also mitigate read-after-write
> inconsistencies caused by immediate HFile validation after moving the
> HFile(s) to the data directory.
> Please see the attachments for this proposal and the initial results captured
> with 25m (25m operations) and 1B (100m operations) YCSB workload A LOAD and
> RUN, and workload C RUN.
> The goal of this JIRA is to discuss with the community whether the proposed
> improvement for the object store use case makes sense and whether we missed
> anything that should be included.
> Improvement Highlights
>  1. Lower write latency, especially the p99+
>  2. Higher write throughput on flush and compaction
>  3. Lower MTTR on region (re)open or assignment
>  4. Remove consistency check dependencies (e.g. DynamoDB) supported by the
> file system implementation



--
This message was sent by Atlassian Jira
(v8.20.1#820001)