Re: [DISCUSS] Change "RunnableOnService" To A More Intuitive Name

Jason Kuster Tue, 28 Mar 2017 12:40:09 -0700

Going off of what Robert and others have said -- I think the sanest way to
use NeedsRunner is as an annotation on specific tests in the set of modules
which the DirectRunner depends on (directly or transitively). It should be
used only as an indication that they should be tested with the DirectRunner
once the DirectRunner is available. In this light, I'd suggest removing
NeedsRunner annotations from everywhere below the DirectRunner in the build
order since it will already be available.


The one wrinkle here is that NeedsRunner is used in Stas's abandoned node
detection in TestPipeline, a very useful piece of functionality. My
suggestion here would be to just turn on abandoned node detection by
default; tests which do not need it can call
p.enableAbandonedNodeEnforcement(false). Thoughts?

On Tue, Mar 28, 2017 at 1:55 AM, Stas Levin <[email protected]> wrote:

> Jason, I can add the write-up in
> https://github.com/apache/beam/pull/2077#issuecomment-282273273 to the
> testing
> section <https://beam.apache.org/contribute/testing/> as part of the
> upcoming doc updates in light of "RunnableOnService" becoming
> "NeedsRunner".
>
> -Stas
>
>
> On Tue, Mar 28, 2017 at 12:38 AM Robert Bradshaw
> <[email protected]> wrote:
>
> > On Mon, Mar 27, 2017 at 1:16 PM, Eugene Kirpichov <
> > [email protected]> wrote:
> >
> > > Thanks Kenn, this breakdown makes a lot of sense. We should probably
> > > clarify this in the documentation of both of these annotations. Though
> > now
> > > that I look at the current docs, they seem clear enough, but perhaps
> they
> > > can be phrased stronger.
> > >
> > > To summarize:
> > > - ValidatesRunner tests are for testing a runner. They should be
> created
> > > mostly by core Beam SDK authors, when introducing a new Beam feature
> etc.
> > > It may even make sense to make this annotation private to the beam sdk
> > core
> > > module.
> > >
> >
> > +1, but see below.
> >
> >
> > > - NeedsRunner tests are for testing a transform. That's what all
> external
> > > users should be using, authors of IOs, etc. It's unspecified which
> runner
> > > will be used, but in practice usually it'll be direct runner.
> > >
> >
> > My one issue with NeedsRunner is that as one moves further from core, the
> > public API of Beam libraries tend to be suites of PTransforms (including
> > IOs) and almost every strong test of the public API (which is where most
> > tests should live) has at least one test that becomes NeedsRunner. It
> seems
> > annotations should be used to mark the exception, not the norm.
> >
> > Perhaps this also stems from my point of view that the Direct Runner is
> in
> > many ways part of the SDK as the reference implementation and test runner
> > (the SDK minus any runner isn't very useful) and so there's little or no
> > need to distinguish tests that require a runner from those that don't
> > (especially the further you get from core, where you should be able to
> > assume you have a runner if you have the ability to construct pipelines
> and
> > run PAssert).
> >
> > - The current situation (use of both of these annotations) does not quite
> > > reflect that, and should be fixed.
> > > - There might be a use case for testing a transform against all
> runners,
> > > and we don't have an agreed-upon solution about how to do that:
> > > ValidatesRunner technically accomplishes this, but it's logically wrong
> > to
> > > use it in this capacity.
> > >
> >
> > I think such tests do validate runners insofar as the @ValidatesRunner
> > suites are not comprehensive.
> >
> > In the ideal world, one could simply assume the primitive transforms are
> > perfectly tested with 100% coverage of all relevant permutations of
> > features, and tests of composite transforms can assume correct(*)
> > implementation of primitives and only test their assembly on a single
> > runner. In practice, our tests are not comprehensive, and these
> > higher-level tests form a much larger suite of additional tests biased
> > towards how the primitive are actually composed and used. Somewhat like a
> > second line of defense. As such, its useful to be able to run this larger
> > suite on a variety of runners, even if it's theoretically redundant.
> >
> > (There's also the sticky issue that if a runner overrides a composite
> > transform there's suddenly value in running that transform's tests
> against
> > that runner. It's unclear where that information should live (seems to
> > belong to the runner, but it should just reference (and possibly augment)
> > the existing suite).)
> >
> > (*) Even the notion of correctness is difficult here, as one might be
> > relying of hidden assumptions that don't hold on all runners because they
> > are not part of the spec. This is the case for the wild-and-crazy test
> > runner that goes out of its way to violate assumptions.
> >
> > On Mon, Mar 27, 2017 at 12:50 PM Kenneth Knowles <[email protected]
> >
> > > wrote:
> > >
> > > > Without claiming that this is the final set of categories or that
> they
> > > are
> > > > used correctly right now, here is what I think they mean:
> > > >
> > > >  - ValidatesRunner tests should be tests of the runner itself,
> > generally
> > > > that it implements a primitive correctly
> > > >  - NeedsRunner tests should be tests of the PTransform/pipeline,
> > assuming
> > > > the runner is correct
> > > >
> > > > Notably, to the extent the assumption of runner correctness holds,
> this
> > > > implies that it is OK to run NeedsRunner tests with just the direct
> > > runner.
> > > >
> > > > Pragmatically, in the Java SDK & IOs, this is not the current
> > breakdown.
> > > >
> > > >  - In the Java SDK the NeedsRunner category is probably used more to
> > flag
> > > > "run this with just the direct runner" than to express the semantic
> > > intent.
> > > > That isn't so bad; it is very close to the right usage.
> > > >
> > > >  - There are IOs that had RunnableOnService tests which are now
> > > > ValidatesRunner tests. While the ability to run an IO does validate a
> > > > runner, this is really an integration test of the IO. If they are to
> be
> > > run
> > > > with just the direct runner they don't need any annotation, because
> the
> > > IO
> > > > can take a test-scoped dependency on the direct runner. So it mostly
> > > makes
> > > > sense to tag those tests for which it is profitable to run against
> all
> > > > runners.
> > > >
> > > > I think the question of IO ITs with the intent to run across runners
> is
> > > > currently under design discussion and I would defer to other people
> on
> > > the
> > > > best way to do that. It could be a new category, or it could be a
> > > different
> > > > design pattern entirely.
> > > >
> > > > Kenn
> > > >
> > > > On Mon, Mar 27, 2017 at 11:55 AM, Eugene Kirpichov <
> > > > [email protected]> wrote:
> > > >
> > > > > Kenn - can you also remind for everybody, what is the difference
> > > between
> > > > > @NeedsRunner and @ValidatesRunner, and when should one use one or
> the
> > > > > other? I always find myself confused about this especially in code
> > > > reviews.
> > > > >
> > > > > On Mon, Mar 27, 2017 at 11:32 AM Kenneth Knowles
> > > <[email protected]
> > > > >
> > > > > wrote:
> > > > >
> > > > > > Hi all,
> > > > > >
> > > > > > I just merged the rename from RunnableOnService to
> ValidatesRunner
> > in
> > > > the
> > > > > > Java codebase (Python was already there)
> > > > > > https://github.com/apache/beam/pull/2157.
> > > > > >
> > > > > > I'm sure there will be stragglers throughout our docs, etc, so
> > please
> > > > do
> > > > > > help me catch them and fix them. And start learning to say
> > > > > > "ValidatesRunner" in conversation :-)
> > > > > >
> > > > > > Kenn
> > > > > >
> > > > > > On Thu, Nov 10, 2016 at 1:01 PM, Lukasz Cwik
> > > <[email protected]
> > > > >
> > > > > > wrote:
> > > > > >
> > > > > > > The default is a crashing runner which throws an exception if
> its
> > > > > > executed.
> > > > > > > This makes SDK core/examples/... not depend on any implemented
> > > > runners.
> > > > > > >
> > > > > > > On Thu, Nov 10, 2016 at 12:37 PM, Robert Bradshaw <
> > > > > > > [email protected]> wrote:
> > > > > > >
> > > > > > > > +1 to ValidatesRunner. I'd be nice if it were (optionally?)
> > > > > > > > parameterized by which feature it validates.
> > > > > > > >
> > > > > > > > @NeedsRunner is odd, as using a runner is the most natural
> way
> > to
> > > > > > > > write many (most) tests, but an annotation should be used to
> > mark
> > > > the
> > > > > > > > exception, not the norm. (I'd just assume a runner is
> available
> > > for
> > > > > > > > all tests, e.g. CoreTests depends on DirectRunner depends on
> > > Core).
> > > > > > > >
> > > > > > > > On Thu, Nov 10, 2016 at 10:14 AM, Mark Liu
> > > > > <[email protected]
> > > > > > >
> > > > > > > > wrote:
> > > > > > > > > +1 ValidatesRunner
> > > > > > > > >
> > > > > > > > > On Thu, Nov 10, 2016 at 8:40 AM, Kenneth Knowles
> > > > > > > <[email protected]
> > > > > > > > >
> > > > > > > > > wrote:
> > > > > > > > >
> > > > > > > > >> Nice. I like ValidatesRunner.
> > > > > > > > >>
> > > > > > > > >> On Nov 10, 2016 03:39, "Amit Sela" <[email protected]>
> > > > wrote:
> > > > > > > > >>
> > > > > > > > >> > How about @ValidatesRunner ?
> > > > > > > > >> > Seems to complement @NeedsRunner as well.
> > > > > > > > >> >
> > > > > > > > >> > On Thu, Nov 10, 2016 at 9:47 AM Aljoscha Krettek <
> > > > > > > [email protected]
> > > > > > > > >
> > > > > > > > >> > wrote:
> > > > > > > > >> >
> > > > > > > > >> > > +1
> > > > > > > > >> > >
> > > > > > > > >> > > What I would really like to see is automatic
> derivation
> > of
> > > > the
> > > > > > > > >> capability
> > > > > > > > >> > > matrix from an extended Runner Test Suite. (As
> outlined
> > in
> > > > > > Thomas'
> > > > > > > > >> doc).
> > > > > > > > >> > >
> > > > > > > > >> > > On Wed, 9 Nov 2016 at 21:42 Kenneth Knowles
> > > > > > > <[email protected]
> > > > > > > > >
> > > > > > > > >> > > wrote:
> > > > > > > > >> > >
> > > > > > > > >> > > > Huge +1 to this.
> > > > > > > > >> > > >
> > > > > > > > >> > > > The two categories I care most about are:
> > > > > > > > >> > > >
> > > > > > > > >> > > > 1. Tests that need a runner, but are testing the
> other
> > > > > "thing
> > > > > > > > under
> > > > > > > > >> > > test";
> > > > > > > > >> > > > today this is NeedsRunner.
> > > > > > > > >> > > > 2. Tests that are intended to test a runner; today
> > this
> > > is
> > > > > > > > >> > > > RunnableOnService.
> > > > > > > > >> > > >
> > > > > > > > >> > > > Actually the lines are not necessary clear between
> > them,
> > > > > but I
> > > > > > > > think
> > > > > > > > >> we
> > > > > > > > >> > > can
> > > > > > > > >> > > > make good choices, like we already do.
> > > > > > > > >> > > >
> > > > > > > > >> > > > The idea of two categories with a common superclass
> > > > actually
> > > > > > > has a
> > > > > > > > >> > > pitfall:
> > > > > > > > >> > > > what if a test is put in the superclass category,
> when
> > > it
> > > > > does
> > > > > > > not
> > > > > > > > >> > have a
> > > > > > > > >> > > > clear meaning? And also, I don't have any good ideas
> > for
> > > > > > names.
> > > > > > > > >> > > >
> > > > > > > > >> > > > So I think just replacing RunnableOnService with
> > > > RunnerTest
> > > > > to
> > > > > > > > make
> > > > > > > > >> > clear
> > > > > > > > >> > > > that it is there just to test the runner is good. We
> > > might
> > > > > > also
> > > > > > > > want
> > > > > > > > >> > > > RunnerIntegrationTest extends NeedsRunner to use in
> > the
> > > IO
> > > > > > > > modules.
> > > > > > > > >> > > >
> > > > > > > > >> > > > See also Thomas's doc on capability matrix testing*
> > > which
> > > > is
> > > > > > > > aimed at
> > > > > > > > >> > > case
> > > > > > > > >> > > > 2. Those tests should all have a category from the
> > doc,
> > > > or a
> > > > > > new
> > > > > > > > one
> > > > > > > > >> > > added.
> > > > > > > > >> > > >
> > > > > > > > >> > > > *
> > > > > > > > >> > > >
> > > > > > > > >> > > >
> > > > > > > > >> > >
> > > > > > https://docs.google.com/document/d/
> 1fICxq32t9yWn9qXhmT07xpclHeHX2
> > > > > > > > >> > VlUyVtpi2WzzGM/edit
> > > > > > > > >> > > >
> > > > > > > > >> > > > Kenn
> > > > > > > > >> > > >
> > > > > > > > >> > > > On Wed, Nov 9, 2016 at 12:20 PM, Jean-Baptiste
> Onofré
> > <
> > > > > > > > >> [email protected]
> > > > > > > > >> > >
> > > > > > > > >> > > > wrote:
> > > > > > > > >> > > >
> > > > > > > > >> > > > > Hi Mark,
> > > > > > > > >> > > > >
> > > > > > > > >> > > > > Generally speaking, I agree.
> > > > > > > > >> > > > >
> > > > > > > > >> > > > > As RunnableOnService extends NeedsRunner,
> > > > @TestsWithRunner
> > > > > > or
> > > > > > > > >> > > > @RunOnRunner
> > > > > > > > >> > > > > sound clearer.
> > > > > > > > >> > > > >
> > > > > > > > >> > > > > Regards
> > > > > > > > >> > > > > JB
> > > > > > > > >> > > > >
> > > > > > > > >> > > > >
> > > > > > > > >> > > > > On 11/09/2016 09:00 PM, Mark Liu wrote:
> > > > > > > > >> > > > >
> > > > > > > > >> > > > >> Hi all,
> > > > > > > > >> > > > >>
> > > > > > > > >> > > > >> I'm working on building RunnableOnService in
> Python
> > > > SDK.
> > > > > > > After
> > > > > > > > >> > having
> > > > > > > > >> > > > >> discussions with folks, "RunnableOnService" looks
> > > like
> > > > > not
> > > > > > a
> > > > > > > > very
> > > > > > > > >> > > > >> intuitive
> > > > > > > > >> > > > >> name for those unit tests that require runners
> and
> > > > build
> > > > > > > > >> lightweight
> > > > > > > > >> > > > >> pipelines to test specific components.
> Especially,
> > > they
> > > > > > don't
> > > > > > > > have
> > > > > > > > >> > to
> > > > > > > > >> > > > run
> > > > > > > > >> > > > >> on a service.
> > > > > > > > >> > > > >>
> > > > > > > > >> > > > >> So I want to raise this idea to the community and
> > see
> > > > if
> > > > > > > anyone
> > > > > > > > >> have
> > > > > > > > >> > > > >> similar thoughts. Maybe we can come up with a
> name
> > > this
> > > > > is
> > > > > > > > tight
> > > > > > > > >> to
> > > > > > > > >> > > > >> runner.
> > > > > > > > >> > > > >> Currently, I have two names in my head:
> > > > > > > > >> > > > >>
> > > > > > > > >> > > > >> - TestsWithRunners
> > > > > > > > >> > > > >> - RunnerExecutable
> > > > > > > > >> > > > >>
> > > > > > > > >> > > > >> Any thoughts?
> > > > > > > > >> > > > >>
> > > > > > > > >> > > > >> Thanks,
> > > > > > > > >> > > > >> Mark
> > > > > > > > >> > > > >>
> > > > > > > > >> > > > >>
> > > > > > > > >> > > > > --
> > > > > > > > >> > > > > Jean-Baptiste Onofré
> > > > > > > > >> > > > > [email protected]
> > > > > > > > >> > > > > http://blog.nanthrax.net
> > > > > > > > >> > > > > Talend - http://www.talend.com
> > > > > > > > >> > > > >
> > > > > > > > >> > > >
> > > > > > > > >> > >
> > > > > > > > >> >
> > > > > > > > >>
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
>



-- 
-------
Jason Kuster
Apache Beam / Google Cloud Dataflow

Re: [DISCUSS] Change "RunnableOnService" To A More Intuitive Name

Reply via email to