It would of course be possible to solve any particular gap with a special
LP method or interface. I see three problems with that:

1. These special-purpose methods would increase in number over time, and
make the new interface less clean and more cluttered. I really like how
clean it is right now.
2. This requires us to know in advance what special-purpose methods are
needed. There are probably other developers out there with a very different
set of testing needs than mine that I'd never think of. This is
particularly difficult in an Apache-licensed project, where APIs are built
in public but often consumed in private organizations.
3. Some developers who need special-purpose LP methods won't be paying
attention to this thread, or only realize a gap later when building a new
feature, and will have to file a JIRA and either wait for the next HBase
release or just use the old minicluster anyway. That's not a good developer
experience.

Providing a normal mode (the new interface) and an "expert" mode (the
pointer to the inner HBaseTestingUtil for server-side code testing) keeps
the interface clean with just a single additional accessor, and allows
developers to adopt the new interface with confidence, knowing that if
there's a gap they can always work around it easily without a JIRA. Marking
the expert mode LP(COPROC, REPLICATION) makes it clear that it _is_ an
expert mode, just for server-side testing.

Or if you prefer, there could be a ServerTestingHBaseCluster interface that
extends TestingHBaseCluster and adds the extra HBTU accessor, with an
LP(COPROC, REPLICATION).

Could you please explain more about what you don't like about exposing HBTU
even for limited cases? There seems to be an assumption that downstream
developers will follow directions for an IA.Private on HBTU by not using
it, but not follow directions for IA.Private annotations on classes
reachable from HBTU?

Thanks,

Geoffrey



On Wed, Jul 21, 2021 at 1:21 AM 张铎(Duo Zhang) <[email protected]> wrote:

> Inline
>
> Geoffrey Jacoby <[email protected]> 于2021年7月21日周三 上午4:43写道:
>
> > Thanks for creating the JIRA for adding the Configuration object.
> >
> > To make the discussion more concrete, I did an initial pass through many
> of
> > the minicluster tests at my dayjob, and here are the gaps I've found so
> > far. These don't count many cases where convenience methods on the HBTU
> are
> > used, but where the tests could be easily modified to use existing client
> > APIs instead.
> >
> > * An enormous number of both internal and Phoenix tests execute MapReduce
> > jobs (for example, Phoenix builds its secondary indexes via MapReduce).
> It
> > doesn't appear possible to run MapReduce jobs through the new testing
> > framework. (I filed HBASE-26102 for this.)
> >
> Usually, you do not need to start a mini MR client for running MR job, it
> will execute locally. Most HBase tests in hbase-mapreduce module do not
> start an actual mini MR cluster.
> And the current MiniMRCluster has already been deprecated in hadoop...
>
> >
> > * A test which uses two miniclusters which share the same mini-ZK quorum
> > under different znodes
> >
> What is the goal for these tests? In HBase, we do this when testing
> replication, but I think it is also OK to use different zk clusters? Unless
> there are problems when starting two zk clusters, then we should try to fix
> this problem.
>
> >
> > * An end-to-end test that verifies that a batch job that appends Cell
> Tags
> > to Delete markers it creates via a coproc in fact writes the Tags. This
> > requires getting a reference to the Region from the minicluster and
> > creating a RegionScanner because the HBase client APIs expressly prevent
> > reading Cell Tags. Supporting this would go against the
> > TestingHBaseCluster's design philosophy, so the old minicluster is still
> > necessary for it.
> >
> This is indeed a problem. We should find a way to let users verify the
> tags, as we do not leak it to client side. We could have something with
> IA.LP to expose the Region interface, maybe.
>
> >
> > So far, I think that with HBASE-26098 and HBASE-26102 the overwhelming
> > majority of internal tests would be fine with the new streamlined
> > TestingHBaseCluster. But not all of them, and there are valid reasons a
> > downstream developer might sometimes need the lower-level control the
> HBTU
> > gives.
> >
> > That's why I still believe that the premise that only HBase and Phoenix
> > need to directly test server-side components is incorrect, because
> > _downstream developers can create server-side components_ like coprocs
> and
> > replication endpoints.
> >
> > My suggested solution:
> > 1. Expose the inner HBaseTestingUtil in TestingHBaseClusterImpl on the
> > TestingHBaseCluster interface as getTestingUtil()
> > 2. Change the InterfaceAudience for HBaseTestingUtil and its related
> > classes to IA.LimitedPrivate(COPROC, REPLICATION) to indicate that these
> > are intended for limited use cases testing server-side integration logic.
> > As Phoenix is primarily a series of coprocs, that would apply to it as
> > well.
> >
> But in general I do not like these ideas. The HBTU should be IA.Private, in
> fact, it is internal to HBase, and exposes bunch of IA.Private classes.
> We could design some new interfaces with IA.LimitedPrivate, but not the
> HBTU itself.
>
> >
> > So then you have TestingHBaseCluster for the very common, easy cases, but
> > it's simple to get more control during rare-but-important hard cases.
> >
> > Geoffrey
> >
> >
> >
> >
> >
> > On Tue, Jul 20, 2021 at 12:17 AM 张铎(Duo Zhang) <[email protected]>
> > wrote:
> >
> > > Oh, sorry... Let me copy the release note here. It is more clear.
> > >
> > > Copy HBaseCommonTestingUtility, HBaseZKTestingUtility,
> > HBaseTestingUtility,
> > > > HBaseCluster, MiniHBaseCluster, StartMiniClusterOption to
> > > > hbase-testing-util, and mark them as Deprecated. They will be removed
> > in
> > > > 4.0.0. End users should use TestingHBaseCluster for writing UTs in
> the
> > > > future.
> > > >
> > > > And in hbase-server and related modules, these classes are renamed.
> > > > HBaseCommonTestingUtility -> HBaseCommonTestingUtil
> > > > HBaseZKTestingUtility -> HBaseZKTestingUtil
> > > > HBaseTestingUtility -> HBaseTestingUtil
> > > > HBaseCluster -> HBaseClusterInterface
> > > > MiniHBaseCluster -> SingleProcessHBaseCluster
> > > > StartMiniClusterOption -> StartTestingClusterOption
> > > > And all these classes are marked as IA.LimitedPrivate("Phoenix"), and
> > > > IS.Evolving. Except Phoenix, other end users should not use them any
> > > more.
> > >
> > >
> > >  The new HBaseTestingUtil is still IA.LP("Phoenix"), as I think Phoenix
> > may
> > > still need to test something internal to HBase. But let's try to
> > > improvement the new TestingHBaseCluster interface to see if finally we
> > > could also not depend on HBTU directly in Phoenix.
> > >
> > > Thanks.
> > >
> > > Andrew Purtell <[email protected]> 于2021年7月20日周二 上午11:42写道:
> > >
> > > > It wasn’t clear until your response that HTBU was retained. The
> NOTICE
> > > > didn’t mention it and gave the impression it was removed. (“Renamed”)
> > > >
> > > > Glad to hear HBTU was deprecated, marked LP, and copied to src/.
> > Resolves
> > > > any concerns I might have had.
> > > >
> > > >
> > > > > On Jul 19, 2021, at 6:56 PM, 张铎 <[email protected]> wrote:
> > > > >
> > > > > Please see my last sentence in the first email
> > > > >
> > > > > We can keep improving it if the current API set is not enough.
> > > > >>
> > > > >
> > > > > If you have any concern on the current API design, for example,
> does
> > > not
> > > > > support passing a Configuration when constructing, then just open
> an
> > > > issue,
> > > > > we can add the support, no problem.
> > > > >
> > > > > And the original HBTU is still marked as
> > IA.LimitedPrivate("Phoenix"),
> > > > the
> > > > > phoenix project can still use it if the new TestingHBaseCluster is
> > not
> > > > > enough.
> > > > >
> > > > > And the deprecated HBTU will keep for the whole 3.x lifecycle, we
> > > copied
> > > > it
> > > > > to the hbase-testing-util module, under the src/main directory. You
> > can
> > > > > still use it for several years...
> > > > >
> > > > > Thanks.
> > > > >
> > > > > Andrew Purtell <[email protected]> 于2021年7月20日周二 上午3:26写道:
> > > > >
> > > > >>> Just leaving a reference to the old, lower-level HBTU as a public
> > > > >> property of the new
> > > > >>> interface seems lower-risk to me. What are the gains from hiding
> > the
> > > > >>> existing HBTU?
> > > > >>
> > > > >> This would be similar to the strategy we adopted for the Admin
> > > > >> interface. Admin there for new users, and as a migration target,
> but
> > > > >> HBaseAdmin still available with deprecation annotations.
> > > > >>
> > > > >> I guess the question is if the consensus is HBTU was meant to be
> > > > >> adopted and consumed by downstream projects.
> > > > >>
> > > > >> In my opinion, nothing in any project's test/* source directories
> > > > >> should be considered public, supported, and supportable. Test
> > > > >> resources within a project exist to test that project, including
> its
> > > > >> private internals.
> > > > >>
> > > > >> What I would recommend, fwiw is two things:
> > > > >>
> > > > >> 1. Explicitly release a supported hbase-testing-util artifact,
> with
> > > > >> Public and LimitedPrivate interfaces, with code in src/ not test/.
> > > > >> 2. Bring back HBTU, but as a compatibility shim. Provide
> deprecated
> > > > >> access to HBTU for Phoenix, marked LP(Phoenix), with this
> deprecated
> > > > >> accessor to be removed in HBase 4.0, along with the HBTU
> interface.
> > > > >>
> > > > >>> On Mon, Jul 19, 2021 at 12:15 PM Geoffrey Jacoby <
> > [email protected]
> > > >
> > > > >>> wrote:
> > > > >>>
> > > > >>> Can this be a [DISCUSS] rather than a [NOTICE]? The implications
> > for
> > > > >>> downstream projects (both Phoenix and many internal projects) are
> > > > large,
> > > > >>> and it seems like something that needs broader discussion before
> > > being
> > > > >> set
> > > > >>> in stone. The HBaseTestingUtility is used extensively in Phoenix,
> > as
> > > > well
> > > > >>> as in many internal projects at my dayjob (some directly and some
> > > > through
> > > > >>> Phoenix's BaseTest wrapping of HBTU) -- it's quite useful.
> > > > >>>
> > > > >>> The idea of a better-encapsulated, easier-to-use HBase testing
> > > utility
> > > > >> is a
> > > > >>> good one, and the TestingHBaseCluster interface looks like a
> > definite
> > > > >>> improvement. However, I notice at least one large gap right away:
> > > there
> > > > >>> doesn't appear to be a way to inject a custom Configuration
> object
> > > into
> > > > >> the
> > > > >>> test cluster, which is a very common pattern. (Example: run a
> test
> > > > suite
> > > > >>> twice with a new minicluster each time, once with a flag off and
> > then
> > > > >> on.)
> > > > >>> This seems like a simple fix.
> > > > >>>
> > > > >>> More concerning is the underlying assumption of the change, that
> > only
> > > > the
> > > > >>> HBase project, and perhaps Phoenix, will ever need to write a
> test
> > of
> > > > >>> server-side components. That's simply not the case, because HBase
> > has
> > > > >> many
> > > > >>> integration points that allow downstream developed code to run in
> > > > >>> server-side processes.
> > > > >>>
> > > > >>> These include:
> > > > >>> Coprocessor Observers and Endpoints
> > > > >>> Replication Endpoints
> > > > >>> MapReduce integration (which acts as a client from HBase's
> > > perspective
> > > > >> but
> > > > >>> runs within YARN services)
> > > > >>>
> > > > >>> In addition, Phoenix supports user-defined functions (UDFs)
> which I
> > > > >> believe
> > > > >>> can run server-side within a coproc in certain query plans.
> > > > >>>
> > > > >>> The change assumes that no one will ever need direct access to
> the
> > > > >> testing
> > > > >>> utility's internal ZooKeeper, MR, or DFS services, but this seems
> > > > >> relevant
> > > > >>> to failure scenario tests of both Replication Endpoints and
> > MapReduce
> > > > >> jobs.
> > > > >>> The Admin API may be able to replace quite a lot of existing
> logic
> > > > going
> > > > >>> forward, and many existing tests already use it rather than the
> > test
> > > > >>> utility directly.  But there are literally thousands of
> downstream
> > > > tests
> > > > >> to
> > > > >>> analyze across many different organizations and institutions to
> > > verify
> > > > >> that
> > > > >>> nothing important is being lost, and that will take time. Just
> > > leaving
> > > > a
> > > > >>> reference to the old, lower-level HBTU as a public property of
> the
> > > new
> > > > >>> interface seems lower-risk to me. What are the gains from hiding
> > the
> > > > >>> existing HBTU?
> > > > >>>
> > > > >>> Geoffrey
> > > > >>>
> > > > >>>
> > > > >>>
> > > > >>> On Sun, Jul 18, 2021 at 9:44 PM 张铎(Duo Zhang) <
> > [email protected]
> > > >
> > > > >> wrote:
> > > > >>>
> > > > >>>> Please see the discussion in
> > > > >>>> https://issues.apache.org/jira/browse/HBASE-13126
> > > > >>>>
> > > > >>>> And final work is done in
> > > > >>>> https://issues.apache.org/jira/browse/HBASE-26081
> > > > >>>> https://github.com/apache/hbase/pull/3478
> > > > >>>>
> > > > >>>> The original HBaseTestingUtility has been renamed to
> > > HBaseTestingUtil,
> > > > >> and
> > > > >>>> MiniHBaseCluster has been renamed to SingleProcessHBaseCluster.
> > Now
> > > > >> they
> > > > >>>> are not expected to be used by end users any more. We marked it
> as
> > > > >>>> IA.LimitedPrivate("Phoenix"), as maybe the Phoenix project may
> > still
> > > > >> need
> > > > >>>> to test something internal to HBase.
> > > > >>>>
> > > > >>>> Anyway, we encourage every downstream projects(including
> Phoenix)
> > to
> > > > >> try to
> > > > >>>> make use of the new TestingHBaseCluster introduced in
> > > > >>>> https://issues.apache.org/jira/browse/HBASE-26080
> > > > >>>>
> > > > >>>> We can keep improving it if the current API set is not enough.
> > > > >>>>
> > > > >>>> ==== 简略的中文版通知,非直译 ====
> > > > >>>>
> > > > >>>> HBaseTestingUtility 已经在 3.0.0 中被标记为 Deprecated,请所有用户尽量尝试使用在
> > > > HBASE-26080
> > > > >>>> 中引入的 TestingHBaseCluster。有任何需求请随时反馈,我们会持续优化。
> > > > >>>>
> > > > >>
> > > > >>
> > > > >>
> > > > >> --
> > > > >> Best regards,
> > > > >> Andrew
> > > > >>
> > > > >> Words like orphans lost among the crosstalk, meaning torn from
> > truth's
> > > > >> decrepit hands
> > > > >>   - A23, Crosstalk
> > > > >>
> > > >
> > >
> >
>

Reply via email to