I haven't been involved with ValidatesRunner/NeedsRunner too much so I'll
avoid commenting on whether a particular interpretation is correct or not.

re: "- There might be a use case for testing a transform against all
runners,
and we don't have an agreed-upon solution about how to do that:
ValidatesRunner technically accomplishes this, but it's logically wrong to
use it in this capacity."

If I take the words "Needs Runner" at face value, what is is saying is that
in order to run the particular test, it must have a pipeline runner in
order to execute. Is there anything stopping us from pulling the set of
NeedsRunner tests and running them against all runners? I assume the
problem with that is that it multiplies the size of our test matrix, and
from a logical perspective we shouldn't *need* to do that.

So then I think the problem you're getting at is that we want to indicate
"this is a test that needs to be run on multiple runners, but is not
formally part of validating them." That is: when the test breaks, it's
likely a problem with the transform, not the runner. Do we think there's a
large category of these? If there was, that would seem to be a distinct
problem from the other two annotations and worthy of a new one. Things like
the BatchingDoFn might be an example.  @NeedsAllRunners ??

As a side comment - I've been discouraging folks from adding either of
these annotations on *IT.java files used for IO ITs, since I think that an
IT should by definition involve using a runner, and we have specific plans
for how we want to run IO ITs. (The IO *Test.java files should still have
@NeedsRunner) I thought I'd mention it since it is related to this topic.

S


On Mon, Mar 27, 2017 at 1:16 PM Eugene Kirpichov
<[email protected]> wrote:

> Thanks Kenn, this breakdown makes a lot of sense. We should probably
> clarify this in the documentation of both of these annotations. Though now
> that I look at the current docs, they seem clear enough, but perhaps they
> can be phrased stronger.
>
> To summarize:
> - ValidatesRunner tests are for testing a runner. They should be created
> mostly by core Beam SDK authors, when introducing a new Beam feature etc.
> It may even make sense to make this annotation private to the beam sdk core
> module.
> - NeedsRunner tests are for testing a transform. That's what all external
> users should be using, authors of IOs, etc. It's unspecified which runner
> will be used, but in practice usually it'll be direct runner.
> - The current situation (use of both of these annotations) does not quite
> reflect that, and should be fixed.
> - There might be a use case for testing a transform against all runners,
> and we don't have an agreed-upon solution about how to do that:
> ValidatesRunner technically accomplishes this, but it's logically wrong to
> use it in this capacity.
>
> Right?
>
> On Mon, Mar 27, 2017 at 12:50 PM Kenneth Knowles <[email protected]>
> wrote:
>
> > Without claiming that this is the final set of categories or that they
> are
> > used correctly right now, here is what I think they mean:
> >
> >  - ValidatesRunner tests should be tests of the runner itself, generally
> > that it implements a primitive correctly
> >  - NeedsRunner tests should be tests of the PTransform/pipeline, assuming
> > the runner is correct
> >
> > Notably, to the extent the assumption of runner correctness holds, this
> > implies that it is OK to run NeedsRunner tests with just the direct
> runner.
> >
> > Pragmatically, in the Java SDK & IOs, this is not the current breakdown.
> >
> >  - In the Java SDK the NeedsRunner category is probably used more to flag
> > "run this with just the direct runner" than to express the semantic
> intent.
> > That isn't so bad; it is very close to the right usage.
> >
> >  - There are IOs that had RunnableOnService tests which are now
> > ValidatesRunner tests. While the ability to run an IO does validate a
> > runner, this is really an integration test of the IO. If they are to be
> run
> > with just the direct runner they don't need any annotation, because the
> IO
> > can take a test-scoped dependency on the direct runner. So it mostly
> makes
> > sense to tag those tests for which it is profitable to run against all
> > runners.
> >
> > I think the question of IO ITs with the intent to run across runners is
> > currently under design discussion and I would defer to other people on
> the
> > best way to do that. It could be a new category, or it could be a
> different
> > design pattern entirely.
> >
> > Kenn
> >
> > On Mon, Mar 27, 2017 at 11:55 AM, Eugene Kirpichov <
> > [email protected]> wrote:
> >
> > > Kenn - can you also remind for everybody, what is the difference
> between
> > > @NeedsRunner and @ValidatesRunner, and when should one use one or the
> > > other? I always find myself confused about this especially in code
> > reviews.
> > >
> > > On Mon, Mar 27, 2017 at 11:32 AM Kenneth Knowles
> <[email protected]
> > >
> > > wrote:
> > >
> > > > Hi all,
> > > >
> > > > I just merged the rename from RunnableOnService to ValidatesRunner in
> > the
> > > > Java codebase (Python was already there)
> > > > https://github.com/apache/beam/pull/2157.
> > > >
> > > > I'm sure there will be stragglers throughout our docs, etc, so please
> > do
> > > > help me catch them and fix them. And start learning to say
> > > > "ValidatesRunner" in conversation :-)
> > > >
> > > > Kenn
> > > >
> > > > On Thu, Nov 10, 2016 at 1:01 PM, Lukasz Cwik
> <[email protected]
> > >
> > > > wrote:
> > > >
> > > > > The default is a crashing runner which throws an exception if its
> > > > executed.
> > > > > This makes SDK core/examples/... not depend on any implemented
> > runners.
> > > > >
> > > > > On Thu, Nov 10, 2016 at 12:37 PM, Robert Bradshaw <
> > > > > [email protected]> wrote:
> > > > >
> > > > > > +1 to ValidatesRunner. I'd be nice if it were (optionally?)
> > > > > > parameterized by which feature it validates.
> > > > > >
> > > > > > @NeedsRunner is odd, as using a runner is the most natural way to
> > > > > > write many (most) tests, but an annotation should be used to mark
> > the
> > > > > > exception, not the norm. (I'd just assume a runner is available
> for
> > > > > > all tests, e.g. CoreTests depends on DirectRunner depends on
> Core).
> > > > > >
> > > > > > On Thu, Nov 10, 2016 at 10:14 AM, Mark Liu
> > > <[email protected]
> > > > >
> > > > > > wrote:
> > > > > > > +1 ValidatesRunner
> > > > > > >
> > > > > > > On Thu, Nov 10, 2016 at 8:40 AM, Kenneth Knowles
> > > > > <[email protected]
> > > > > > >
> > > > > > > wrote:
> > > > > > >
> > > > > > >> Nice. I like ValidatesRunner.
> > > > > > >>
> > > > > > >> On Nov 10, 2016 03:39, "Amit Sela" <[email protected]>
> > wrote:
> > > > > > >>
> > > > > > >> > How about @ValidatesRunner ?
> > > > > > >> > Seems to complement @NeedsRunner as well.
> > > > > > >> >
> > > > > > >> > On Thu, Nov 10, 2016 at 9:47 AM Aljoscha Krettek <
> > > > > [email protected]
> > > > > > >
> > > > > > >> > wrote:
> > > > > > >> >
> > > > > > >> > > +1
> > > > > > >> > >
> > > > > > >> > > What I would really like to see is automatic derivation of
> > the
> > > > > > >> capability
> > > > > > >> > > matrix from an extended Runner Test Suite. (As outlined in
> > > > Thomas'
> > > > > > >> doc).
> > > > > > >> > >
> > > > > > >> > > On Wed, 9 Nov 2016 at 21:42 Kenneth Knowles
> > > > > <[email protected]
> > > > > > >
> > > > > > >> > > wrote:
> > > > > > >> > >
> > > > > > >> > > > Huge +1 to this.
> > > > > > >> > > >
> > > > > > >> > > > The two categories I care most about are:
> > > > > > >> > > >
> > > > > > >> > > > 1. Tests that need a runner, but are testing the other
> > > "thing
> > > > > > under
> > > > > > >> > > test";
> > > > > > >> > > > today this is NeedsRunner.
> > > > > > >> > > > 2. Tests that are intended to test a runner; today this
> is
> > > > > > >> > > > RunnableOnService.
> > > > > > >> > > >
> > > > > > >> > > > Actually the lines are not necessary clear between them,
> > > but I
> > > > > > think
> > > > > > >> we
> > > > > > >> > > can
> > > > > > >> > > > make good choices, like we already do.
> > > > > > >> > > >
> > > > > > >> > > > The idea of two categories with a common superclass
> > actually
> > > > > has a
> > > > > > >> > > pitfall:
> > > > > > >> > > > what if a test is put in the superclass category, when
> it
> > > does
> > > > > not
> > > > > > >> > have a
> > > > > > >> > > > clear meaning? And also, I don't have any good ideas for
> > > > names.
> > > > > > >> > > >
> > > > > > >> > > > So I think just replacing RunnableOnService with
> > RunnerTest
> > > to
> > > > > > make
> > > > > > >> > clear
> > > > > > >> > > > that it is there just to test the runner is good. We
> might
> > > > also
> > > > > > want
> > > > > > >> > > > RunnerIntegrationTest extends NeedsRunner to use in the
> IO
> > > > > > modules.
> > > > > > >> > > >
> > > > > > >> > > > See also Thomas's doc on capability matrix testing*
> which
> > is
> > > > > > aimed at
> > > > > > >> > > case
> > > > > > >> > > > 2. Those tests should all have a category from the doc,
> > or a
> > > > new
> > > > > > one
> > > > > > >> > > added.
> > > > > > >> > > >
> > > > > > >> > > > *
> > > > > > >> > > >
> > > > > > >> > > >
> > > > > > >> > >
> > > > https://docs.google.com/document/d/1fICxq32t9yWn9qXhmT07xpclHeHX2
> > > > > > >> > VlUyVtpi2WzzGM/edit
> > > > > > >> > > >
> > > > > > >> > > > Kenn
> > > > > > >> > > >
> > > > > > >> > > > On Wed, Nov 9, 2016 at 12:20 PM, Jean-Baptiste Onofré <
> > > > > > >> [email protected]
> > > > > > >> > >
> > > > > > >> > > > wrote:
> > > > > > >> > > >
> > > > > > >> > > > > Hi Mark,
> > > > > > >> > > > >
> > > > > > >> > > > > Generally speaking, I agree.
> > > > > > >> > > > >
> > > > > > >> > > > > As RunnableOnService extends NeedsRunner,
> > @TestsWithRunner
> > > > or
> > > > > > >> > > > @RunOnRunner
> > > > > > >> > > > > sound clearer.
> > > > > > >> > > > >
> > > > > > >> > > > > Regards
> > > > > > >> > > > > JB
> > > > > > >> > > > >
> > > > > > >> > > > >
> > > > > > >> > > > > On 11/09/2016 09:00 PM, Mark Liu wrote:
> > > > > > >> > > > >
> > > > > > >> > > > >> Hi all,
> > > > > > >> > > > >>
> > > > > > >> > > > >> I'm working on building RunnableOnService in Python
> > SDK.
> > > > > After
> > > > > > >> > having
> > > > > > >> > > > >> discussions with folks, "RunnableOnService" looks
> like
> > > not
> > > > a
> > > > > > very
> > > > > > >> > > > >> intuitive
> > > > > > >> > > > >> name for those unit tests that require runners and
> > build
> > > > > > >> lightweight
> > > > > > >> > > > >> pipelines to test specific components. Especially,
> they
> > > > don't
> > > > > > have
> > > > > > >> > to
> > > > > > >> > > > run
> > > > > > >> > > > >> on a service.
> > > > > > >> > > > >>
> > > > > > >> > > > >> So I want to raise this idea to the community and see
> > if
> > > > > anyone
> > > > > > >> have
> > > > > > >> > > > >> similar thoughts. Maybe we can come up with a name
> this
> > > is
> > > > > > tight
> > > > > > >> to
> > > > > > >> > > > >> runner.
> > > > > > >> > > > >> Currently, I have two names in my head:
> > > > > > >> > > > >>
> > > > > > >> > > > >> - TestsWithRunners
> > > > > > >> > > > >> - RunnerExecutable
> > > > > > >> > > > >>
> > > > > > >> > > > >> Any thoughts?
> > > > > > >> > > > >>
> > > > > > >> > > > >> Thanks,
> > > > > > >> > > > >> Mark
> > > > > > >> > > > >>
> > > > > > >> > > > >>
> > > > > > >> > > > > --
> > > > > > >> > > > > Jean-Baptiste Onofré
> > > > > > >> > > > > [email protected]
> > > > > > >> > > > > http://blog.nanthrax.net
> > > > > > >> > > > > Talend - http://www.talend.com
> > > > > > >> > > > >
> > > > > > >> > > >
> > > > > > >> > >
> > > > > > >> >
> > > > > > >>
> > > > > >
> > > > >
> > > >
> > >
> >
>

Reply via email to