Great! I'll work on getting the PR into an actual, proper shape now,
including looking at found violations more carefully and eventually
freezing current violations (maybe removing some quick-wins).

One more thing I just ran into is that ArchUnit doesn't explicitly support
Scala; while many things just work (since it's still byte code),
Scala-specific concepts like traits seem to cause issues. I'll have to
exclude Scala code from the checks for now, I think.


Ingo

On Tue, Sep 7, 2021 at 5:03 PM Chesnay Schepler <ches...@apache.org> wrote:

> I would say that's fine time-wise.
>
> On 07/09/2021 15:29, Ingo Bürk wrote:
> > Thanks, Chesnay. I updated the PR to use a separate module now, and ran
> it
> > on a few modules (some Table API modules and a couple connectors). The CI
> > seemed to take ~2.5min for executing the tests; that's certainly not
> > negligible. On the other hand, even the few tests implemented already
> found
> > several violations ("several" is an understatement, but I manually
> verified
> > some of them, not all of them).
> >
> > On Mon, Sep 6, 2021 at 3:44 PM Chesnay Schepler <ches...@apache.org>
> wrote:
> >
> >> While flink-tests is currently the best choice in that it has the
> >> biggest classpath, it is also the module already requiring the most time
> >> on CI.
> >>
> >> Furthermore, given that we ideally cover all APIs (including connectors
> >> & formats), having that mess of dependencies in flink-tests may
> >> interfere with existing / future tests.
> >>
> >> As such I would prefer a separate module, as annoying as that may be.
> >>
> >> On 06/09/2021 15:26, Ingo Bürk wrote:
> >>> I just quickly chatted with the author/maintainer of ArchUnit, and a
> >> module
> >>> which depends on every module that should be tested seems to be the
> best
> >>> solution. How do you feel about using flink-tests for this vs. having a
> >>> separate module for this purpose?
> >>>
> >>>
> >>> Ingo
> >>>
> >>> On Mon, Sep 6, 2021 at 3:04 PM Ingo Bürk <i...@ververica.com> wrote:
> >>>
> >>>> Hi Chesnay,
> >>>>
> >>>> Those are all great questions, and I want to tackle those as well. For
> >> the
> >>>> moment I went per-module, but runtime-wise that isn't ideal the more
> >>>> modules we'd activate this in. ArchUnit does cache classes between
> >> tests,
> >>>> but if we run them individually per module, we'd still add up quite a
> >> bit
> >>>> of execution time (a single module in my IDE is around 10s with the
> >> tests I
> >>>> currently have implemented, but I suspect the bottleneck here is the
> >>>> importing of classes, not the number of tests). Ideally we'd just run
> >> them
> >>>> once in a module with a big enough classpath to cover everything. If
> we
> >>>> have such a place, that would probably be our best shot. I'll also
> keep
> >>>> investigating here, of course.
> >>>>
> >>>> For now I just pushed a solution to avoid the overlap when executing
> it
> >>>> per-module by matching on the URI. It's not the prettiest solution,
> but
> >>>> does work; but that's more to not fail the tests in unrelated modules
> >> and
> >>>> doesn't help much with execution time.
> >>>>
> >>>>
> >>>> Ingo
> >>>>
> >>>> On Mon, Sep 6, 2021 at 1:57 PM Chesnay Schepler <ches...@apache.org>
> >>>> wrote:
> >>>>
> >>>>> Do you have an estimate for long these tests would run for?
> >>>>>
> >>>>> For project-wide tests, what are the options for setting that up?
> >>>>> If we let the tests run per-module then I guess they'd overlap
> >>>>> considerably (because other Flink modules are being put on the
> >>>>> classpath), which isn't ideal.
> >>>>>
> >>>>> On 06/09/2021 13:51, David Morávek wrote:
> >>>>>> Hi Ingo,
> >>>>>>
> >>>>>> +1 for this effort. This could automate lot of "written rules" that
> >> are
> >>>>>> easy to forget about / not to be aware of (such as that each test
> >> should
> >>>>>> extend the TestLogger as Till has already mentioned).
> >>>>>>
> >>>>>> I went trough your examples and ArchUnit looks really powerful and
> >>>>>> expressive while still being easy to read.
> >>>>>>
> >>>>>> Best,
> >>>>>> D.
> >>>>>>
> >>>>>> On Mon, Sep 6, 2021 at 1:00 PM Ingo Bürk <i...@ververica.com>
> wrote:
> >>>>>>
> >>>>>>> Thanks for your input Chesnay!
> >>>>>>>
> >>>>>>> The limitations of ArchUnit probably mostly stem from the fact that
> >> it
> >>>>>>> operates on byte code and thus can't access anything not accessible
> >>>>> from
> >>>>>>> byte code, i.e. JavaDocs. But I think Checkstyle and ArchUnit are
> >>>>>>> complementing each other quite well here. The main reason against
> >>>>>>> Checkstyle for these tests is its limitation to single files only,
> >>>>>>> rendering many tests (including the one you mentioned) impossible.
> >> The
> >>>>>>> secondary reason is that ArchUnit has more declarative APIs and the
> >>>>> tests
> >>>>>>> become quite easy to write and maintain (some groundwork effort is
> >>>>> needed,
> >>>>>>> of course). Over time we could probably expand quite a bit more on
> >>>>> what is
> >>>>>>> tested with ArchUnit as it can test entire architectures (package
> >>>>> accesses
> >>>>>>> etc.), and it has support for "freezing" known violations to
> prevent
> >>>>> new
> >>>>>>> violations and removing existing ones over time.
> >>>>>>>
> >>>>>>> The @VisibleForTesting use case you mentioned is possible; I've
> >> pushed
> >>>>> a
> >>>>>>> version of that rule to the draft PR now as well.
> >>>>>>>
> >>>>>>>
> >>>>>>> Best
> >>>>>>> Ingo
> >>>>>>>
> >>>>>>> On Mon, Sep 6, 2021 at 12:11 PM Chesnay Schepler <
> ches...@apache.org
> >>>>>>> wrote:
> >>>>>>>
> >>>>>>>> This sounds like an interesting effort.
> >>>>>>>>
> >>>>>>>> The draft you have opened uses ArchUnit; can you explain a bit
> what
> >>>>> the
> >>>>>>>> capabilities/limitations of said tool are?
> >>>>>>>>
> >>>>>>>> One thing we wanted to have for a long time is that
> methods/classes
> >>>>>>>> annotated with @VisibleForTesting are not called from production
> >> code;
> >>>>>>>> would that be something that could be implemented?
> >>>>>>>>
> >>>>>>>> It's not a problem imo that tests need to run in order to catch
> >> stuff;
> >>>>>>>> so long as it is noticed on CI.
> >>>>>>>>
> >>>>>>>> On 03/09/2021 08:48, Ingo Bürk wrote:
> >>>>>>>>> Hi Timo, Till,
> >>>>>>>>>
> >>>>>>>>> thanks for your input already. I'm glad to hear that the idea
> >>>>>>> resonates,
> >>>>>>>>> also thanks for the additional ideas!
> >>>>>>>>>
> >>>>>>>>> I've created a JIRA issue[1] for now just to track this idea. I'm
> >>>>> also
> >>>>>>>>> working on a bit of a proof of concept and opened it as a draft
> >>>>> PR[2].
> >>>>>>>> I'm
> >>>>>>>>> happy for anyone to join that PR to look and discuss. The PR
> >> doesn't
> >>>>>>>>> necessarily intend to be merged in its current state, but is
> rather
> >>>>> for
> >>>>>>>>> evaluation.
> >>>>>>>>>
> >>>>>>>>> Meanwhile I'm also collecting ideas in a Google Doc, so if anyone
> >>>>> wants
> >>>>>>>> to
> >>>>>>>>> suggest more use cases to explore or implement, please let me
> know.
> >>>>>>>>>
> >>>>>>>>> [1] https://issues.apache.org/jira/browse/FLINK-24138
> >>>>>>>>> [2] https://github.com/apache/flink/pull/17133
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>> Best
> >>>>>>>>> Ingo
> >>>>>>>>>
> >>>>>>>>> On Thu, Sep 2, 2021 at 12:31 PM Till Rohrmann <
> >> trohrm...@apache.org>
> >>>>>>>> wrote:
> >>>>>>>>>> If it is possible to automate these kinds of checks, then I am
> all
> >>>>> for
> >>>>>>>> it
> >>>>>>>>>> because everything else breaks eventually. So +1 for this
> >> proposal.
> >>>>>>>>>> I don't have experience with what tools are available, though.
> >>>>>>>>>>
> >>>>>>>>>> I would like to add a rule that every test class extends
> directly
> >> or
> >>>>>>>>>> indirectly TestLogger because otherwise it is super hard to make
> >>>>> sense
> >>>>>>>> of
> >>>>>>>>>> the test logs (Arvid will probably chime in stating that this
> will
> >>>>> be
> >>>>>>>>>> solved with Junit5 eventually).
> >>>>>>>>>>
> >>>>>>>>>> Not sure whether this is possible or not but if we can check
> that
> >>>>> all
> >>>>>>>>>> interfaces have properly defined JavaDocs on each method, then
> >> this
> >>>>>>>> could
> >>>>>>>>>> also be helpful in my opinion.
> >>>>>>>>>>
> >>>>>>>>>> Cheers,
> >>>>>>>>>> Till
> >>>>>>>>>>
> >>>>>>>>>> On Thu, Sep 2, 2021 at 11:16 AM Timo Walther <
> twal...@apache.org>
> >>>>>>>> wrote:
> >>>>>>>>>>> Hi Ingo,
> >>>>>>>>>>>
> >>>>>>>>>>> thanks for starting this discussion. Having more automation is
> >>>>>>>>>>> definitely desirable. Esp. in the API / SDK areas where we
> >>>>> frequently
> >>>>>>>>>>> have to add similar comments to PRs. The more checks the
> better.
> >> We
> >>>>>>>>>>> definitely also need more guidelines (e.g. how to develop a
> Flink
> >>>>>>>>>>> connector) but automation is safer then long checklists that
> >> might
> >>>>> be
> >>>>>>>>>>> out of date quickly.
> >>>>>>>>>>>
> >>>>>>>>>>> +1 to the proposal. I don't have an opinion on the tool though.
> >>>>>>>>>>>
> >>>>>>>>>>> Regards,
> >>>>>>>>>>> Timo
> >>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>> On 01.09.21 11:03, Ingo Bürk wrote:
> >>>>>>>>>>>> Hello everyone,
> >>>>>>>>>>>>
> >>>>>>>>>>>> I would like to start a discussion on introducing automated
> >> tests
> >>>>>>> for
> >>>>>>>>>>> more
> >>>>>>>>>>>> architectural rather than stilistic topics. For example, here
> >> are
> >>>>> a
> >>>>>>>> few
> >>>>>>>>>>>> things that seem worth checking to me (this is
> Table-API-focused
> >>>>>>> since
> >>>>>>>>>> it
> >>>>>>>>>>>> is the subsystem I'm involved in):
> >>>>>>>>>>>>
> >>>>>>>>>>>> (a) All classes in o.a.f.table.api should be annotated with
> one
> >>>>>>>>>>>> of @Internal, @PublicEvolving, or @Public.
> >>>>>>>>>>>> (b) Classes whose name ends in *ConnectorOptions should be
> >> located
> >>>>>>> in
> >>>>>>>>>>>> o.a.f.connector.*.table
> >>>>>>>>>>>> (c) Classes implementing DynamicSourceFactory /
> >> DynamicSinkFactory
> >>>>>>>>>> should
> >>>>>>>>>>>> have no static members of type ConfigOption
> >>>>>>>>>>>>
> >>>>>>>>>>>> There are probably significantly more cases worth checking,
> and
> >>>>> also
> >>>>>>>>>> more
> >>>>>>>>>>>> involved ones (these are rather simple examples), like
> >> disallowing
> >>>>>>>>>> access
> >>>>>>>>>>>> between certain packages etc. There are two questions I would
> >> like
> >>>>>>> to
> >>>>>>>>>> ask
> >>>>>>>>>>>> to the community:
> >>>>>>>>>>>>
> >>>>>>>>>>>> (1) Do you think such tests are useful in general?
> >>>>>>>>>>>> (2) What use cases come to mind for you?
> >>>>>>>>>>>>
> >>>>>>>>>>>> If the idea finds consensus, I would like to use (2) to
> >>>>> investigate
> >>>>>>>>>> which
> >>>>>>>>>>>> tooling to use. An obvious candidate is Checkstyle, as this is
> >>>>>>> already
> >>>>>>>>>>>> used. It also has the advantage of being well integrated in
> the
> >>>>> IDE.
> >>>>>>>>>>>> However, it is limited to looking at single files only, and
> >> custom
> >>>>>>>>>> checks
> >>>>>>>>>>>> are pretty complicated and involved to implement[1]. Another
> >>>>>>> possible
> >>>>>>>>>>> tool
> >>>>>>>>>>>> is ArchUnit[2], which would be significantly easier to
> maintain
> >>>>> and
> >>>>>>> is
> >>>>>>>>>>> more
> >>>>>>>>>>>> powerful, but in turn requires tests to be executed. If you
> have
> >>>>>>>>>> further
> >>>>>>>>>>>> suggestions (or thoughts) they would of course also be quite
> >>>>>>> welcome,
> >>>>>>>>>>>> though for now I would focus on (1) and (2) and go from there
> to
> >>>>>>>>>>> evaluate.
> >>>>>>>>>>>> [1] https://checkstyle.sourceforge.io/writingchecks.html
> >>>>>>>>>>>> [2] https://www.archunit.org/
> >>>>>>>>>>>>
> >>>>>>>>>>>>
> >>>>>>>>>>>> Best
> >>>>>>>>>>>> Ingo
> >>>>>>>>>>>>
> >>
>
>

Reply via email to