Inline Geoffrey Jacoby <[email protected]> 于2021年7月21日周三 上午4:43写道:
> Thanks for creating the JIRA for adding the Configuration object. > > To make the discussion more concrete, I did an initial pass through many of > the minicluster tests at my dayjob, and here are the gaps I've found so > far. These don't count many cases where convenience methods on the HBTU are > used, but where the tests could be easily modified to use existing client > APIs instead. > > * An enormous number of both internal and Phoenix tests execute MapReduce > jobs (for example, Phoenix builds its secondary indexes via MapReduce). It > doesn't appear possible to run MapReduce jobs through the new testing > framework. (I filed HBASE-26102 for this.) > Usually, you do not need to start a mini MR client for running MR job, it will execute locally. Most HBase tests in hbase-mapreduce module do not start an actual mini MR cluster. And the current MiniMRCluster has already been deprecated in hadoop... > > * A test which uses two miniclusters which share the same mini-ZK quorum > under different znodes > What is the goal for these tests? In HBase, we do this when testing replication, but I think it is also OK to use different zk clusters? Unless there are problems when starting two zk clusters, then we should try to fix this problem. > > * An end-to-end test that verifies that a batch job that appends Cell Tags > to Delete markers it creates via a coproc in fact writes the Tags. This > requires getting a reference to the Region from the minicluster and > creating a RegionScanner because the HBase client APIs expressly prevent > reading Cell Tags. Supporting this would go against the > TestingHBaseCluster's design philosophy, so the old minicluster is still > necessary for it. > This is indeed a problem. We should find a way to let users verify the tags, as we do not leak it to client side. We could have something with IA.LP to expose the Region interface, maybe. > > So far, I think that with HBASE-26098 and HBASE-26102 the overwhelming > majority of internal tests would be fine with the new streamlined > TestingHBaseCluster. But not all of them, and there are valid reasons a > downstream developer might sometimes need the lower-level control the HBTU > gives. > > That's why I still believe that the premise that only HBase and Phoenix > need to directly test server-side components is incorrect, because > _downstream developers can create server-side components_ like coprocs and > replication endpoints. > > My suggested solution: > 1. Expose the inner HBaseTestingUtil in TestingHBaseClusterImpl on the > TestingHBaseCluster interface as getTestingUtil() > 2. Change the InterfaceAudience for HBaseTestingUtil and its related > classes to IA.LimitedPrivate(COPROC, REPLICATION) to indicate that these > are intended for limited use cases testing server-side integration logic. > As Phoenix is primarily a series of coprocs, that would apply to it as > well. > But in general I do not like these ideas. The HBTU should be IA.Private, in fact, it is internal to HBase, and exposes bunch of IA.Private classes. We could design some new interfaces with IA.LimitedPrivate, but not the HBTU itself. > > So then you have TestingHBaseCluster for the very common, easy cases, but > it's simple to get more control during rare-but-important hard cases. > > Geoffrey > > > > > > On Tue, Jul 20, 2021 at 12:17 AM 张铎(Duo Zhang) <[email protected]> > wrote: > > > Oh, sorry... Let me copy the release note here. It is more clear. > > > > Copy HBaseCommonTestingUtility, HBaseZKTestingUtility, > HBaseTestingUtility, > > > HBaseCluster, MiniHBaseCluster, StartMiniClusterOption to > > > hbase-testing-util, and mark them as Deprecated. They will be removed > in > > > 4.0.0. End users should use TestingHBaseCluster for writing UTs in the > > > future. > > > > > > And in hbase-server and related modules, these classes are renamed. > > > HBaseCommonTestingUtility -> HBaseCommonTestingUtil > > > HBaseZKTestingUtility -> HBaseZKTestingUtil > > > HBaseTestingUtility -> HBaseTestingUtil > > > HBaseCluster -> HBaseClusterInterface > > > MiniHBaseCluster -> SingleProcessHBaseCluster > > > StartMiniClusterOption -> StartTestingClusterOption > > > And all these classes are marked as IA.LimitedPrivate("Phoenix"), and > > > IS.Evolving. Except Phoenix, other end users should not use them any > > more. > > > > > > The new HBaseTestingUtil is still IA.LP("Phoenix"), as I think Phoenix > may > > still need to test something internal to HBase. But let's try to > > improvement the new TestingHBaseCluster interface to see if finally we > > could also not depend on HBTU directly in Phoenix. > > > > Thanks. > > > > Andrew Purtell <[email protected]> 于2021年7月20日周二 上午11:42写道: > > > > > It wasn’t clear until your response that HTBU was retained. The NOTICE > > > didn’t mention it and gave the impression it was removed. (“Renamed”) > > > > > > Glad to hear HBTU was deprecated, marked LP, and copied to src/. > Resolves > > > any concerns I might have had. > > > > > > > > > > On Jul 19, 2021, at 6:56 PM, 张铎 <[email protected]> wrote: > > > > > > > > Please see my last sentence in the first email > > > > > > > > We can keep improving it if the current API set is not enough. > > > >> > > > > > > > > If you have any concern on the current API design, for example, does > > not > > > > support passing a Configuration when constructing, then just open an > > > issue, > > > > we can add the support, no problem. > > > > > > > > And the original HBTU is still marked as > IA.LimitedPrivate("Phoenix"), > > > the > > > > phoenix project can still use it if the new TestingHBaseCluster is > not > > > > enough. > > > > > > > > And the deprecated HBTU will keep for the whole 3.x lifecycle, we > > copied > > > it > > > > to the hbase-testing-util module, under the src/main directory. You > can > > > > still use it for several years... > > > > > > > > Thanks. > > > > > > > > Andrew Purtell <[email protected]> 于2021年7月20日周二 上午3:26写道: > > > > > > > >>> Just leaving a reference to the old, lower-level HBTU as a public > > > >> property of the new > > > >>> interface seems lower-risk to me. What are the gains from hiding > the > > > >>> existing HBTU? > > > >> > > > >> This would be similar to the strategy we adopted for the Admin > > > >> interface. Admin there for new users, and as a migration target, but > > > >> HBaseAdmin still available with deprecation annotations. > > > >> > > > >> I guess the question is if the consensus is HBTU was meant to be > > > >> adopted and consumed by downstream projects. > > > >> > > > >> In my opinion, nothing in any project's test/* source directories > > > >> should be considered public, supported, and supportable. Test > > > >> resources within a project exist to test that project, including its > > > >> private internals. > > > >> > > > >> What I would recommend, fwiw is two things: > > > >> > > > >> 1. Explicitly release a supported hbase-testing-util artifact, with > > > >> Public and LimitedPrivate interfaces, with code in src/ not test/. > > > >> 2. Bring back HBTU, but as a compatibility shim. Provide deprecated > > > >> access to HBTU for Phoenix, marked LP(Phoenix), with this deprecated > > > >> accessor to be removed in HBase 4.0, along with the HBTU interface. > > > >> > > > >>> On Mon, Jul 19, 2021 at 12:15 PM Geoffrey Jacoby < > [email protected] > > > > > > >>> wrote: > > > >>> > > > >>> Can this be a [DISCUSS] rather than a [NOTICE]? The implications > for > > > >>> downstream projects (both Phoenix and many internal projects) are > > > large, > > > >>> and it seems like something that needs broader discussion before > > being > > > >> set > > > >>> in stone. The HBaseTestingUtility is used extensively in Phoenix, > as > > > well > > > >>> as in many internal projects at my dayjob (some directly and some > > > through > > > >>> Phoenix's BaseTest wrapping of HBTU) -- it's quite useful. > > > >>> > > > >>> The idea of a better-encapsulated, easier-to-use HBase testing > > utility > > > >> is a > > > >>> good one, and the TestingHBaseCluster interface looks like a > definite > > > >>> improvement. However, I notice at least one large gap right away: > > there > > > >>> doesn't appear to be a way to inject a custom Configuration object > > into > > > >> the > > > >>> test cluster, which is a very common pattern. (Example: run a test > > > suite > > > >>> twice with a new minicluster each time, once with a flag off and > then > > > >> on.) > > > >>> This seems like a simple fix. > > > >>> > > > >>> More concerning is the underlying assumption of the change, that > only > > > the > > > >>> HBase project, and perhaps Phoenix, will ever need to write a test > of > > > >>> server-side components. That's simply not the case, because HBase > has > > > >> many > > > >>> integration points that allow downstream developed code to run in > > > >>> server-side processes. > > > >>> > > > >>> These include: > > > >>> Coprocessor Observers and Endpoints > > > >>> Replication Endpoints > > > >>> MapReduce integration (which acts as a client from HBase's > > perspective > > > >> but > > > >>> runs within YARN services) > > > >>> > > > >>> In addition, Phoenix supports user-defined functions (UDFs) which I > > > >> believe > > > >>> can run server-side within a coproc in certain query plans. > > > >>> > > > >>> The change assumes that no one will ever need direct access to the > > > >> testing > > > >>> utility's internal ZooKeeper, MR, or DFS services, but this seems > > > >> relevant > > > >>> to failure scenario tests of both Replication Endpoints and > MapReduce > > > >> jobs. > > > >>> The Admin API may be able to replace quite a lot of existing logic > > > going > > > >>> forward, and many existing tests already use it rather than the > test > > > >>> utility directly. But there are literally thousands of downstream > > > tests > > > >> to > > > >>> analyze across many different organizations and institutions to > > verify > > > >> that > > > >>> nothing important is being lost, and that will take time. Just > > leaving > > > a > > > >>> reference to the old, lower-level HBTU as a public property of the > > new > > > >>> interface seems lower-risk to me. What are the gains from hiding > the > > > >>> existing HBTU? > > > >>> > > > >>> Geoffrey > > > >>> > > > >>> > > > >>> > > > >>> On Sun, Jul 18, 2021 at 9:44 PM 张铎(Duo Zhang) < > [email protected] > > > > > > >> wrote: > > > >>> > > > >>>> Please see the discussion in > > > >>>> https://issues.apache.org/jira/browse/HBASE-13126 > > > >>>> > > > >>>> And final work is done in > > > >>>> https://issues.apache.org/jira/browse/HBASE-26081 > > > >>>> https://github.com/apache/hbase/pull/3478 > > > >>>> > > > >>>> The original HBaseTestingUtility has been renamed to > > HBaseTestingUtil, > > > >> and > > > >>>> MiniHBaseCluster has been renamed to SingleProcessHBaseCluster. > Now > > > >> they > > > >>>> are not expected to be used by end users any more. We marked it as > > > >>>> IA.LimitedPrivate("Phoenix"), as maybe the Phoenix project may > still > > > >> need > > > >>>> to test something internal to HBase. > > > >>>> > > > >>>> Anyway, we encourage every downstream projects(including Phoenix) > to > > > >> try to > > > >>>> make use of the new TestingHBaseCluster introduced in > > > >>>> https://issues.apache.org/jira/browse/HBASE-26080 > > > >>>> > > > >>>> We can keep improving it if the current API set is not enough. > > > >>>> > > > >>>> ==== 简略的中文版通知,非直译 ==== > > > >>>> > > > >>>> HBaseTestingUtility 已经在 3.0.0 中被标记为 Deprecated,请所有用户尽量尝试使用在 > > > HBASE-26080 > > > >>>> 中引入的 TestingHBaseCluster。有任何需求请随时反馈,我们会持续优化。 > > > >>>> > > > >> > > > >> > > > >> > > > >> -- > > > >> Best regards, > > > >> Andrew > > > >> > > > >> Words like orphans lost among the crosstalk, meaning torn from > truth's > > > >> decrepit hands > > > >> - A23, Crosstalk > > > >> > > > > > >
