Hi Ingo,
Thanks for driving this discussion.
Some use cases come to my mind, maybe those rules could be checked by the
same way.
1. new introduced `StreamExecNode` is implemented json
serialization/deserialization. Currently it is checked in
`JsonSerdeCoverageTest`.
2. new introduced `RelNode` could be covered by all `MetadataHandler`s.
Currently this rule not exists yet.

Best regards,
JING ZHANG


Ingo Bürk <i...@ververica.com> 于2021年9月9日周四 下午6:49写道:

> Great! I'll work on getting the PR into an actual, proper shape now,
> including looking at found violations more carefully and eventually
> freezing current violations (maybe removing some quick-wins).
>
> One more thing I just ran into is that ArchUnit doesn't explicitly support
> Scala; while many things just work (since it's still byte code),
> Scala-specific concepts like traits seem to cause issues. I'll have to
> exclude Scala code from the checks for now, I think.
>
>
> Ingo
>
> On Tue, Sep 7, 2021 at 5:03 PM Chesnay Schepler <ches...@apache.org>
> wrote:
>
> > I would say that's fine time-wise.
> >
> > On 07/09/2021 15:29, Ingo Bürk wrote:
> > > Thanks, Chesnay. I updated the PR to use a separate module now, and ran
> > it
> > > on a few modules (some Table API modules and a couple connectors). The
> CI
> > > seemed to take ~2.5min for executing the tests; that's certainly not
> > > negligible. On the other hand, even the few tests implemented already
> > found
> > > several violations ("several" is an understatement, but I manually
> > verified
> > > some of them, not all of them).
> > >
> > > On Mon, Sep 6, 2021 at 3:44 PM Chesnay Schepler <ches...@apache.org>
> > wrote:
> > >
> > >> While flink-tests is currently the best choice in that it has the
> > >> biggest classpath, it is also the module already requiring the most
> time
> > >> on CI.
> > >>
> > >> Furthermore, given that we ideally cover all APIs (including
> connectors
> > >> & formats), having that mess of dependencies in flink-tests may
> > >> interfere with existing / future tests.
> > >>
> > >> As such I would prefer a separate module, as annoying as that may be.
> > >>
> > >> On 06/09/2021 15:26, Ingo Bürk wrote:
> > >>> I just quickly chatted with the author/maintainer of ArchUnit, and a
> > >> module
> > >>> which depends on every module that should be tested seems to be the
> > best
> > >>> solution. How do you feel about using flink-tests for this vs.
> having a
> > >>> separate module for this purpose?
> > >>>
> > >>>
> > >>> Ingo
> > >>>
> > >>> On Mon, Sep 6, 2021 at 3:04 PM Ingo Bürk <i...@ververica.com> wrote:
> > >>>
> > >>>> Hi Chesnay,
> > >>>>
> > >>>> Those are all great questions, and I want to tackle those as well.
> For
> > >> the
> > >>>> moment I went per-module, but runtime-wise that isn't ideal the more
> > >>>> modules we'd activate this in. ArchUnit does cache classes between
> > >> tests,
> > >>>> but if we run them individually per module, we'd still add up quite
> a
> > >> bit
> > >>>> of execution time (a single module in my IDE is around 10s with the
> > >> tests I
> > >>>> currently have implemented, but I suspect the bottleneck here is the
> > >>>> importing of classes, not the number of tests). Ideally we'd just
> run
> > >> them
> > >>>> once in a module with a big enough classpath to cover everything. If
> > we
> > >>>> have such a place, that would probably be our best shot. I'll also
> > keep
> > >>>> investigating here, of course.
> > >>>>
> > >>>> For now I just pushed a solution to avoid the overlap when executing
> > it
> > >>>> per-module by matching on the URI. It's not the prettiest solution,
> > but
> > >>>> does work; but that's more to not fail the tests in unrelated
> modules
> > >> and
> > >>>> doesn't help much with execution time.
> > >>>>
> > >>>>
> > >>>> Ingo
> > >>>>
> > >>>> On Mon, Sep 6, 2021 at 1:57 PM Chesnay Schepler <ches...@apache.org
> >
> > >>>> wrote:
> > >>>>
> > >>>>> Do you have an estimate for long these tests would run for?
> > >>>>>
> > >>>>> For project-wide tests, what are the options for setting that up?
> > >>>>> If we let the tests run per-module then I guess they'd overlap
> > >>>>> considerably (because other Flink modules are being put on the
> > >>>>> classpath), which isn't ideal.
> > >>>>>
> > >>>>> On 06/09/2021 13:51, David Morávek wrote:
> > >>>>>> Hi Ingo,
> > >>>>>>
> > >>>>>> +1 for this effort. This could automate lot of "written rules"
> that
> > >> are
> > >>>>>> easy to forget about / not to be aware of (such as that each test
> > >> should
> > >>>>>> extend the TestLogger as Till has already mentioned).
> > >>>>>>
> > >>>>>> I went trough your examples and ArchUnit looks really powerful and
> > >>>>>> expressive while still being easy to read.
> > >>>>>>
> > >>>>>> Best,
> > >>>>>> D.
> > >>>>>>
> > >>>>>> On Mon, Sep 6, 2021 at 1:00 PM Ingo Bürk <i...@ververica.com>
> > wrote:
> > >>>>>>
> > >>>>>>> Thanks for your input Chesnay!
> > >>>>>>>
> > >>>>>>> The limitations of ArchUnit probably mostly stem from the fact
> that
> > >> it
> > >>>>>>> operates on byte code and thus can't access anything not
> accessible
> > >>>>> from
> > >>>>>>> byte code, i.e. JavaDocs. But I think Checkstyle and ArchUnit are
> > >>>>>>> complementing each other quite well here. The main reason against
> > >>>>>>> Checkstyle for these tests is its limitation to single files
> only,
> > >>>>>>> rendering many tests (including the one you mentioned)
> impossible.
> > >> The
> > >>>>>>> secondary reason is that ArchUnit has more declarative APIs and
> the
> > >>>>> tests
> > >>>>>>> become quite easy to write and maintain (some groundwork effort
> is
> > >>>>> needed,
> > >>>>>>> of course). Over time we could probably expand quite a bit more
> on
> > >>>>> what is
> > >>>>>>> tested with ArchUnit as it can test entire architectures (package
> > >>>>> accesses
> > >>>>>>> etc.), and it has support for "freezing" known violations to
> > prevent
> > >>>>> new
> > >>>>>>> violations and removing existing ones over time.
> > >>>>>>>
> > >>>>>>> The @VisibleForTesting use case you mentioned is possible; I've
> > >> pushed
> > >>>>> a
> > >>>>>>> version of that rule to the draft PR now as well.
> > >>>>>>>
> > >>>>>>>
> > >>>>>>> Best
> > >>>>>>> Ingo
> > >>>>>>>
> > >>>>>>> On Mon, Sep 6, 2021 at 12:11 PM Chesnay Schepler <
> > ches...@apache.org
> > >>>>>>> wrote:
> > >>>>>>>
> > >>>>>>>> This sounds like an interesting effort.
> > >>>>>>>>
> > >>>>>>>> The draft you have opened uses ArchUnit; can you explain a bit
> > what
> > >>>>> the
> > >>>>>>>> capabilities/limitations of said tool are?
> > >>>>>>>>
> > >>>>>>>> One thing we wanted to have for a long time is that
> > methods/classes
> > >>>>>>>> annotated with @VisibleForTesting are not called from production
> > >> code;
> > >>>>>>>> would that be something that could be implemented?
> > >>>>>>>>
> > >>>>>>>> It's not a problem imo that tests need to run in order to catch
> > >> stuff;
> > >>>>>>>> so long as it is noticed on CI.
> > >>>>>>>>
> > >>>>>>>> On 03/09/2021 08:48, Ingo Bürk wrote:
> > >>>>>>>>> Hi Timo, Till,
> > >>>>>>>>>
> > >>>>>>>>> thanks for your input already. I'm glad to hear that the idea
> > >>>>>>> resonates,
> > >>>>>>>>> also thanks for the additional ideas!
> > >>>>>>>>>
> > >>>>>>>>> I've created a JIRA issue[1] for now just to track this idea.
> I'm
> > >>>>> also
> > >>>>>>>>> working on a bit of a proof of concept and opened it as a draft
> > >>>>> PR[2].
> > >>>>>>>> I'm
> > >>>>>>>>> happy for anyone to join that PR to look and discuss. The PR
> > >> doesn't
> > >>>>>>>>> necessarily intend to be merged in its current state, but is
> > rather
> > >>>>> for
> > >>>>>>>>> evaluation.
> > >>>>>>>>>
> > >>>>>>>>> Meanwhile I'm also collecting ideas in a Google Doc, so if
> anyone
> > >>>>> wants
> > >>>>>>>> to
> > >>>>>>>>> suggest more use cases to explore or implement, please let me
> > know.
> > >>>>>>>>>
> > >>>>>>>>> [1] https://issues.apache.org/jira/browse/FLINK-24138
> > >>>>>>>>> [2] https://github.com/apache/flink/pull/17133
> > >>>>>>>>>
> > >>>>>>>>>
> > >>>>>>>>> Best
> > >>>>>>>>> Ingo
> > >>>>>>>>>
> > >>>>>>>>> On Thu, Sep 2, 2021 at 12:31 PM Till Rohrmann <
> > >> trohrm...@apache.org>
> > >>>>>>>> wrote:
> > >>>>>>>>>> If it is possible to automate these kinds of checks, then I am
> > all
> > >>>>> for
> > >>>>>>>> it
> > >>>>>>>>>> because everything else breaks eventually. So +1 for this
> > >> proposal.
> > >>>>>>>>>> I don't have experience with what tools are available, though.
> > >>>>>>>>>>
> > >>>>>>>>>> I would like to add a rule that every test class extends
> > directly
> > >> or
> > >>>>>>>>>> indirectly TestLogger because otherwise it is super hard to
> make
> > >>>>> sense
> > >>>>>>>> of
> > >>>>>>>>>> the test logs (Arvid will probably chime in stating that this
> > will
> > >>>>> be
> > >>>>>>>>>> solved with Junit5 eventually).
> > >>>>>>>>>>
> > >>>>>>>>>> Not sure whether this is possible or not but if we can check
> > that
> > >>>>> all
> > >>>>>>>>>> interfaces have properly defined JavaDocs on each method, then
> > >> this
> > >>>>>>>> could
> > >>>>>>>>>> also be helpful in my opinion.
> > >>>>>>>>>>
> > >>>>>>>>>> Cheers,
> > >>>>>>>>>> Till
> > >>>>>>>>>>
> > >>>>>>>>>> On Thu, Sep 2, 2021 at 11:16 AM Timo Walther <
> > twal...@apache.org>
> > >>>>>>>> wrote:
> > >>>>>>>>>>> Hi Ingo,
> > >>>>>>>>>>>
> > >>>>>>>>>>> thanks for starting this discussion. Having more automation
> is
> > >>>>>>>>>>> definitely desirable. Esp. in the API / SDK areas where we
> > >>>>> frequently
> > >>>>>>>>>>> have to add similar comments to PRs. The more checks the
> > better.
> > >> We
> > >>>>>>>>>>> definitely also need more guidelines (e.g. how to develop a
> > Flink
> > >>>>>>>>>>> connector) but automation is safer then long checklists that
> > >> might
> > >>>>> be
> > >>>>>>>>>>> out of date quickly.
> > >>>>>>>>>>>
> > >>>>>>>>>>> +1 to the proposal. I don't have an opinion on the tool
> though.
> > >>>>>>>>>>>
> > >>>>>>>>>>> Regards,
> > >>>>>>>>>>> Timo
> > >>>>>>>>>>>
> > >>>>>>>>>>>
> > >>>>>>>>>>> On 01.09.21 11:03, Ingo Bürk wrote:
> > >>>>>>>>>>>> Hello everyone,
> > >>>>>>>>>>>>
> > >>>>>>>>>>>> I would like to start a discussion on introducing automated
> > >> tests
> > >>>>>>> for
> > >>>>>>>>>>> more
> > >>>>>>>>>>>> architectural rather than stilistic topics. For example,
> here
> > >> are
> > >>>>> a
> > >>>>>>>> few
> > >>>>>>>>>>>> things that seem worth checking to me (this is
> > Table-API-focused
> > >>>>>>> since
> > >>>>>>>>>> it
> > >>>>>>>>>>>> is the subsystem I'm involved in):
> > >>>>>>>>>>>>
> > >>>>>>>>>>>> (a) All classes in o.a.f.table.api should be annotated with
> > one
> > >>>>>>>>>>>> of @Internal, @PublicEvolving, or @Public.
> > >>>>>>>>>>>> (b) Classes whose name ends in *ConnectorOptions should be
> > >> located
> > >>>>>>> in
> > >>>>>>>>>>>> o.a.f.connector.*.table
> > >>>>>>>>>>>> (c) Classes implementing DynamicSourceFactory /
> > >> DynamicSinkFactory
> > >>>>>>>>>> should
> > >>>>>>>>>>>> have no static members of type ConfigOption
> > >>>>>>>>>>>>
> > >>>>>>>>>>>> There are probably significantly more cases worth checking,
> > and
> > >>>>> also
> > >>>>>>>>>> more
> > >>>>>>>>>>>> involved ones (these are rather simple examples), like
> > >> disallowing
> > >>>>>>>>>> access
> > >>>>>>>>>>>> between certain packages etc. There are two questions I
> would
> > >> like
> > >>>>>>> to
> > >>>>>>>>>> ask
> > >>>>>>>>>>>> to the community:
> > >>>>>>>>>>>>
> > >>>>>>>>>>>> (1) Do you think such tests are useful in general?
> > >>>>>>>>>>>> (2) What use cases come to mind for you?
> > >>>>>>>>>>>>
> > >>>>>>>>>>>> If the idea finds consensus, I would like to use (2) to
> > >>>>> investigate
> > >>>>>>>>>> which
> > >>>>>>>>>>>> tooling to use. An obvious candidate is Checkstyle, as this
> is
> > >>>>>>> already
> > >>>>>>>>>>>> used. It also has the advantage of being well integrated in
> > the
> > >>>>> IDE.
> > >>>>>>>>>>>> However, it is limited to looking at single files only, and
> > >> custom
> > >>>>>>>>>> checks
> > >>>>>>>>>>>> are pretty complicated and involved to implement[1]. Another
> > >>>>>>> possible
> > >>>>>>>>>>> tool
> > >>>>>>>>>>>> is ArchUnit[2], which would be significantly easier to
> > maintain
> > >>>>> and
> > >>>>>>> is
> > >>>>>>>>>>> more
> > >>>>>>>>>>>> powerful, but in turn requires tests to be executed. If you
> > have
> > >>>>>>>>>> further
> > >>>>>>>>>>>> suggestions (or thoughts) they would of course also be quite
> > >>>>>>> welcome,
> > >>>>>>>>>>>> though for now I would focus on (1) and (2) and go from
> there
> > to
> > >>>>>>>>>>> evaluate.
> > >>>>>>>>>>>> [1] https://checkstyle.sourceforge.io/writingchecks.html
> > >>>>>>>>>>>> [2] https://www.archunit.org/
> > >>>>>>>>>>>>
> > >>>>>>>>>>>>
> > >>>>>>>>>>>> Best
> > >>>>>>>>>>>> Ingo
> > >>>>>>>>>>>>
> > >>
> >
> >
>

Reply via email to