On Wed, Oct 4, 2017 at 3:51 PM, Andrew Purtell <apurt...@apache.org> wrote:
> I think it is fine to rebrand these interfaces as for coprocessors and tag > them LP(COPROC): > > Region (use HRegion in internals) > > Store (use HStore in internals) > > MasterServices (use HMaster in internals) > > RegionServerServices (use HRegionServer in internals) > > and pare them down. This is not a complete list, just a handful of > examples. > > Will do. Just to say that it means a bunch of refactoring undoing reliance on Interfaces (We have too many services and managers!) > Some specific points of feedback I have had: > > In Region, it would be good for CPs to be able to schedule async flushes > and compactions, poll or wait for completion of a specific request, or wait > for all pending flushes and compactions to complete. There is a Phoenix use > case for this in indexing. > > The means are in place. Will make sure it happens. > Security coprocessors use MasterServices to create their system tables. > Maybe this can be replaced by using the normal admin API for same. > > Yes. Working on this too. > When removing access to internal services, consider if there are client API > equivalents that the CP can use, and if embedded calls to such client APIs > from the coprocessor context would be a good idea. CP invocation of an > internal service is simply an in-process method call. That's good and bad, > right? The bad part, direct access, is the thing we want to restrict, the > motivation for this work (in part). But the good thing is it avoids a lot > of the fat client logic unnecessary for all-in-process service invocation, > which might even not work correctly. Removing everything is as drastic as > allowing CPs access to everything. It could be fine to drastically pare > down, but please consider it. > > The short-circuit of RPC (and PB) we have where when a Client is on the target host, the invocation is direct helps here. Moving the RegionServerGroup feature over to the new cut-down Coprocessor API has turned up some holes in our Admin Interface -- i.e. there is no getServers method unless you go via ClusterReport -- that I'm backfilling so RSGroups can use Admin Interface instead of now-removed CP API. But also, have had to add back a few items to MasterServices in a more constrained form. > Some changes have been proposed that removes access to metrics (e.g. > RegionMetrics, MasterMetrics). Right now coprocessors can bypass core > function and replace it. Until and unless we remove the bypass semantic > (under discussion) we should continue to allow CPs access to metrics > objects so they can update metrics as expected by admins and users when > replacing functionality (via bypass). Metrics are a public facing API. I > agree this is kind of dodgy. I believe we should remove the bypass > semantic. Once that is done, coprocessors can only mix in additional > functionality. No more cause to touch core metrics. They can export their > own metrics if so desired. > > > Will be back to ping you on this one. Yeah, I think it dodgy CPs updating internal metrics. Will look at the radical removal of bypass too... Will be back. Thanks Andrew, S > On Wed, Oct 4, 2017 at 3:15 PM, Stack <st...@duboce.net> wrote: > > > A bunch of us are making good progress on the next alpha release, > > hbase-2.0.0-alpha-4. The theme for this release is "Fixing the > Coprocessor > > API", mostly undoing access accidentally granted Coprocessors. I'm > talking > > out loud about a particularly awkward item here rather than in a comment > up > > in JIRA so the airing sees a broader audience. Interested in any opinions > > or input you might have. > > > > TL;DR MasterService/RegionServerService and Region, etc., Interfaces > were > > overloaded serving two, disparate roles; a load of refactoring has to be > > done to undo the damage. Suggestions for how to avoid making same mistake > > in future? > > > > I'm working on "HBASE-12260 MasterServices - remove from coprocessor API > > (Discuss)". MasterServices started out as a subset of the Master > > functionality. The idea back then was that certain Services and Managers > > could make do w/ less-than-full-access to the HMaster process. If so, we > > could test the Service and Manager without having to standup a full > HMaster > > instance (This usually required our putting up a cluster too). If > > MasterServices had but a few methods, a Mock would be easy, making > testing > > easier still. We did the same thing on the RegionServer side where we had > > RegionServerServices. > > > > MasterServices (and RegionServerServices) were also exposed to > > Coprocessors. The idea was that CPs could ask-for particular Master > > functions via MasterService. In this role MasterServices was meant to > > constrain what CPs had access to. > > > > MasterServices therefore had two consumers; one internal, the other not. > > > > With time, MasterServices got fat as internal Services and Managers > needed > > more of HMaster. Everytime we added to MasterServices, CPs got access. > > > > On survey as part of the recent HBASE-12260 work, it turns out that the > > bulk of the methods in MasterServices are actually annotated > > InterfaceAudience.Private; i.e. for internal use only, not for > > Coprocessors. A brutal purge of Private audience objects, makes for a > > MasterServices we can pass Coprocessors but Coprocessors now have much > less > > facility available (for parts, there are alternatives; Andy review > suggests > > that CPs are severely crimped if this patch goes in). But MasterServices > > can no longer be used for its original purpose, passing Services and > > Managers a subset of HMaster. The latter brings on a substantial > refactor. > > > > Another example of the double-role problem outlined above was found by > Duo > > and Anoop in the RegionServer Coprocessor refactor salt mine. They hit a > > similar tangle. There was the RegionServerServices <=> MasterServices > case > > but also the exposure of HRegion internals. In this latter, Region was > > introduced by Andy EXPLICITLY as a subset of HRegion facility FOR > > Coprocessors. Subsequently, we all confused his original intent and went > > ahead and thought of Region (as opposed to HRegion) as an Interface for > > HRegion and plumbed it throughout the code base in place of explicit > > HRegion references. As Region picked up functionality, Coprocessors > gained > > more access. > > > > The refactoring pattern that has emerged out of the RegionServer-side > > refactoring (which is ahead of the Master-work), is that we move to use > the > > HRegion implementation everywhere internally, undoing use of Region > > Interface; Region Interface picks up a "FOR COPROCESSORS ONLY" stamp. I'm > > following suit on the Master side moving to use HMaster in place of > > MasterServices in all launched Services and Managers. > > > > How do we avoid this mistake in future? Should we rename Region as > > CoprocessorRegion so it more plain that its consumer is Coprocessors? > Ditto > > on MasterServices? > > > > Thanks, > > S > > > > > > -- > Best regards, > Andrew > > Words like orphans lost among the crosstalk, meaning torn from truth's > decrepit hands > - A23, Crosstalk >