[ https://issues.apache.org/jira/browse/HBASE-25714?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17791761#comment-17791761 ]
Andrew Kyle Purtell edited comment on HBASE-25714 at 11/30/23 8:19 PM: ----------------------------------------------------------------------- [~vjasani] [~niuyulin] I spent a couple of days a while back backporting these changes to a local tree based on branch-2. I got as far as subtask 4, decided it could be done when and if needed, and moved on to other work. The commits on the dev branch can serve as an implementation guide for redo of the work in the current master code base. If you are thinking about moving forward, the subtasks on this issue do remain a good implementation plan. Many hunks in the patches apply even on branch-2 so can guide as well, although should not be applied directly. However compatibility and rolling upgrade concerns are underspecified in the design doc and in the "implementation guide", as I will call this umbrella. Reading through the design doc then, and again now, I think some changes to the design should be considered: - In the new design the regionserver, in order to initiate a compaction, makes a request to the master (_requestCompaction_), which then forwards the request to the compaction server. The compaction server, when complete, then sends completion notice directly to the RS. This doesn't seem ideal. Why does the master need to be involved at all? The RS can contact one of the compaction servers directly and make the request? Or, ok, say we do decide the master should be involved in making region level decisions about compaction, which would be a new architecture, why does the completion notice not get sent back to the master? The new flow should be symmetric. Stated reason for this in the discussion on the design doc is maybe the master is not available and does not need to be involved in the path of the completion notice. Exactly. My advice: Do not involve the master at all. Not even to track what compaction servers are available or not. We can use a znode in zookeeper to track the live set of compaction servers instead. Have the compaction servers register themselves as ephemeral znodes under {{/hbase/compactionServers}} or similar and the regionservers can keep a watch on this and manage a local list of available compaction servers themselves. - Coprocessor API changes should be backwards compatible. Introducing a new abstract types in the middle of the hierarchy will cause compatibility breaks. Do we need these changes? The compaction server can provide _RegionCoprocessorHost_ and _RegionServerServices_ objects too. Many methods should throw _UnsupportedOperationException_ and this can be fine. Coprocessor implementors would need to adapt their logic based on whether or not an exception is thrown. They can interact with the region by _Connection_ instead of _Region_/_HRegion_ directly. And we can add a convenience marker interface so coprocessor developers can check if they are operating in a regionserver or in a compaction server at init time. I think this can be done in a compatible way. That said, downstream coprocessor users like Apache Phoenix, will need to update their design to account for potential compaction offload. They will not be able to rely on shared state with the regionserver process, because it will be a separate process. was (Author: apurtell): [~vjasani] [~niuyulin] I spent a couple of days a while back backporting these changes to a local tree based on branch-2. I got as far as subtask 4, decided it could be done when and if needed, and moved on. Even on master branch there are divergences from the dev branch due to changes over time. If you are thinking about moving forward, the subtasks on this issue are still a good implementation plan. Many hunks in the patches apply even on branch-2 so can guide as well, although should not be applied directly. However compatibility and rolling upgrade concerns are underspecified in the design doc and in the "implementation guide", as I will call this umbrella. Reading through the design doc then, and again now, I think some changes to the design should be considered: - In the new design the regionserver, in order to initiate a compaction, makes a request to the master (_requestCompaction_), which then forwards the request to the compaction server. The compaction server, when complete, then sends completion notice directly to the RS. This doesn't seem ideal. Why does the master need to be involved at all? The RS can contact one of the compaction servers directly and make the request? Or, ok, say we do decide the master should be involved in making region level decisions about compaction, which would be a new architecture, why does the completion notice not get sent back to the master? The new flow should be symmetric. Stated reason for this in the discussion on the design doc is maybe the master is not available and does not need to be involved in the path of the completion notice. Exactly. My advice: Do not involve the master at all. Not even to track what compaction servers are available or not. We can use a znode in zookeeper to track the live set of compaction servers instead. Have the compaction servers register themselves as ephemeral znodes under {{/hbase/compactionServers}} or similar and the regionservers can keep a watch on this and manage a local list of available compaction servers themselves. - Coprocessor API changes should be backwards compatible. Introducing a new abstract types in the middle of the hierarchy will cause compatibility breaks. Do we need these changes? The compaction server can provide _RegionCoprocessorHost_ and _RegionServerServices_ objects too. Many methods should throw _UnsupportedOperationException_ and this can be fine. Coprocessor implementors would need to adapt their logic based on whether or not an exception is thrown. They can interact with the region by _Connection_ instead of _Region_/_HRegion_ directly. And we can add a convenience marker interface so coprocessor developers can check if they are operating in a regionserver or in a compaction server at init time. I think this can be done in a compatible way. That said, downstream coprocessor users like Apache Phoenix, will need to update their design to account for potential compaction offload. They will not be able to rely on shared state with the regionserver process, because it will be a separate process. > Offload the compaction job to independent Compaction Server > ----------------------------------------------------------- > > Key: HBASE-25714 > URL: https://issues.apache.org/jira/browse/HBASE-25714 > Project: HBase > Issue Type: Umbrella > Reporter: Yulin Niu > Assignee: Yulin Niu > Priority: Major > Attachments: CoprocessorSupport1.png, CoprocessorSupport2.png > > > The basic idea is add a role "CompactionServer" to take the Compaction job. > HMaster is responsible for scheduling the compaction job to different > CompactionServer. > [design > doc|https://docs.google.com/document/d/1exmhQpQArAgnryLaV78K3260rKm64BHBNzZE4VdTz0c/edit?usp=sharing] > Suggestions are welcomed. Thanks. -- This message was sent by Atlassian Jira (v8.20.10#820010)