A bunch of us are making good progress on the next alpha release, hbase-2.0.0-alpha-4. The theme for this release is "Fixing the Coprocessor API", mostly undoing access accidentally granted Coprocessors. I'm talking out loud about a particularly awkward item here rather than in a comment up in JIRA so the airing sees a broader audience. Interested in any opinions or input you might have.
TL;DR MasterService/RegionServerService and Region, etc., Interfaces were overloaded serving two, disparate roles; a load of refactoring has to be done to undo the damage. Suggestions for how to avoid making same mistake in future? I'm working on "HBASE-12260 MasterServices - remove from coprocessor API (Discuss)". MasterServices started out as a subset of the Master functionality. The idea back then was that certain Services and Managers could make do w/ less-than-full-access to the HMaster process. If so, we could test the Service and Manager without having to standup a full HMaster instance (This usually required our putting up a cluster too). If MasterServices had but a few methods, a Mock would be easy, making testing easier still. We did the same thing on the RegionServer side where we had RegionServerServices. MasterServices (and RegionServerServices) were also exposed to Coprocessors. The idea was that CPs could ask-for particular Master functions via MasterService. In this role MasterServices was meant to constrain what CPs had access to. MasterServices therefore had two consumers; one internal, the other not. With time, MasterServices got fat as internal Services and Managers needed more of HMaster. Everytime we added to MasterServices, CPs got access. On survey as part of the recent HBASE-12260 work, it turns out that the bulk of the methods in MasterServices are actually annotated InterfaceAudience.Private; i.e. for internal use only, not for Coprocessors. A brutal purge of Private audience objects, makes for a MasterServices we can pass Coprocessors but Coprocessors now have much less facility available (for parts, there are alternatives; Andy review suggests that CPs are severely crimped if this patch goes in). But MasterServices can no longer be used for its original purpose, passing Services and Managers a subset of HMaster. The latter brings on a substantial refactor. Another example of the double-role problem outlined above was found by Duo and Anoop in the RegionServer Coprocessor refactor salt mine. They hit a similar tangle. There was the RegionServerServices <=> MasterServices case but also the exposure of HRegion internals. In this latter, Region was introduced by Andy EXPLICITLY as a subset of HRegion facility FOR Coprocessors. Subsequently, we all confused his original intent and went ahead and thought of Region (as opposed to HRegion) as an Interface for HRegion and plumbed it throughout the code base in place of explicit HRegion references. As Region picked up functionality, Coprocessors gained more access. The refactoring pattern that has emerged out of the RegionServer-side refactoring (which is ahead of the Master-work), is that we move to use the HRegion implementation everywhere internally, undoing use of Region Interface; Region Interface picks up a "FOR COPROCESSORS ONLY" stamp. I'm following suit on the Master side moving to use HMaster in place of MasterServices in all launched Services and Managers. How do we avoid this mistake in future? Should we rename Region as CoprocessorRegion so it more plain that its consumer is Coprocessors? Ditto on MasterServices? Thanks, S