Thanks, Chesnay. I updated the PR to use a separate module now, and ran it on a few modules (some Table API modules and a couple connectors). The CI seemed to take ~2.5min for executing the tests; that's certainly not negligible. On the other hand, even the few tests implemented already found several violations ("several" is an understatement, but I manually verified some of them, not all of them).
On Mon, Sep 6, 2021 at 3:44 PM Chesnay Schepler <ches...@apache.org> wrote: > While flink-tests is currently the best choice in that it has the > biggest classpath, it is also the module already requiring the most time > on CI. > > Furthermore, given that we ideally cover all APIs (including connectors > & formats), having that mess of dependencies in flink-tests may > interfere with existing / future tests. > > As such I would prefer a separate module, as annoying as that may be. > > On 06/09/2021 15:26, Ingo Bürk wrote: > > I just quickly chatted with the author/maintainer of ArchUnit, and a > module > > which depends on every module that should be tested seems to be the best > > solution. How do you feel about using flink-tests for this vs. having a > > separate module for this purpose? > > > > > > Ingo > > > > On Mon, Sep 6, 2021 at 3:04 PM Ingo Bürk <i...@ververica.com> wrote: > > > >> Hi Chesnay, > >> > >> Those are all great questions, and I want to tackle those as well. For > the > >> moment I went per-module, but runtime-wise that isn't ideal the more > >> modules we'd activate this in. ArchUnit does cache classes between > tests, > >> but if we run them individually per module, we'd still add up quite a > bit > >> of execution time (a single module in my IDE is around 10s with the > tests I > >> currently have implemented, but I suspect the bottleneck here is the > >> importing of classes, not the number of tests). Ideally we'd just run > them > >> once in a module with a big enough classpath to cover everything. If we > >> have such a place, that would probably be our best shot. I'll also keep > >> investigating here, of course. > >> > >> For now I just pushed a solution to avoid the overlap when executing it > >> per-module by matching on the URI. It's not the prettiest solution, but > >> does work; but that's more to not fail the tests in unrelated modules > and > >> doesn't help much with execution time. > >> > >> > >> Ingo > >> > >> On Mon, Sep 6, 2021 at 1:57 PM Chesnay Schepler <ches...@apache.org> > >> wrote: > >> > >>> Do you have an estimate for long these tests would run for? > >>> > >>> For project-wide tests, what are the options for setting that up? > >>> If we let the tests run per-module then I guess they'd overlap > >>> considerably (because other Flink modules are being put on the > >>> classpath), which isn't ideal. > >>> > >>> On 06/09/2021 13:51, David Morávek wrote: > >>>> Hi Ingo, > >>>> > >>>> +1 for this effort. This could automate lot of "written rules" that > are > >>>> easy to forget about / not to be aware of (such as that each test > should > >>>> extend the TestLogger as Till has already mentioned). > >>>> > >>>> I went trough your examples and ArchUnit looks really powerful and > >>>> expressive while still being easy to read. > >>>> > >>>> Best, > >>>> D. > >>>> > >>>> On Mon, Sep 6, 2021 at 1:00 PM Ingo Bürk <i...@ververica.com> wrote: > >>>> > >>>>> Thanks for your input Chesnay! > >>>>> > >>>>> The limitations of ArchUnit probably mostly stem from the fact that > it > >>>>> operates on byte code and thus can't access anything not accessible > >>> from > >>>>> byte code, i.e. JavaDocs. But I think Checkstyle and ArchUnit are > >>>>> complementing each other quite well here. The main reason against > >>>>> Checkstyle for these tests is its limitation to single files only, > >>>>> rendering many tests (including the one you mentioned) impossible. > The > >>>>> secondary reason is that ArchUnit has more declarative APIs and the > >>> tests > >>>>> become quite easy to write and maintain (some groundwork effort is > >>> needed, > >>>>> of course). Over time we could probably expand quite a bit more on > >>> what is > >>>>> tested with ArchUnit as it can test entire architectures (package > >>> accesses > >>>>> etc.), and it has support for "freezing" known violations to prevent > >>> new > >>>>> violations and removing existing ones over time. > >>>>> > >>>>> The @VisibleForTesting use case you mentioned is possible; I've > pushed > >>> a > >>>>> version of that rule to the draft PR now as well. > >>>>> > >>>>> > >>>>> Best > >>>>> Ingo > >>>>> > >>>>> On Mon, Sep 6, 2021 at 12:11 PM Chesnay Schepler <ches...@apache.org > > > >>>>> wrote: > >>>>> > >>>>>> This sounds like an interesting effort. > >>>>>> > >>>>>> The draft you have opened uses ArchUnit; can you explain a bit what > >>> the > >>>>>> capabilities/limitations of said tool are? > >>>>>> > >>>>>> One thing we wanted to have for a long time is that methods/classes > >>>>>> annotated with @VisibleForTesting are not called from production > code; > >>>>>> would that be something that could be implemented? > >>>>>> > >>>>>> It's not a problem imo that tests need to run in order to catch > stuff; > >>>>>> so long as it is noticed on CI. > >>>>>> > >>>>>> On 03/09/2021 08:48, Ingo Bürk wrote: > >>>>>>> Hi Timo, Till, > >>>>>>> > >>>>>>> thanks for your input already. I'm glad to hear that the idea > >>>>> resonates, > >>>>>>> also thanks for the additional ideas! > >>>>>>> > >>>>>>> I've created a JIRA issue[1] for now just to track this idea. I'm > >>> also > >>>>>>> working on a bit of a proof of concept and opened it as a draft > >>> PR[2]. > >>>>>> I'm > >>>>>>> happy for anyone to join that PR to look and discuss. The PR > doesn't > >>>>>>> necessarily intend to be merged in its current state, but is rather > >>> for > >>>>>>> evaluation. > >>>>>>> > >>>>>>> Meanwhile I'm also collecting ideas in a Google Doc, so if anyone > >>> wants > >>>>>> to > >>>>>>> suggest more use cases to explore or implement, please let me know. > >>>>>>> > >>>>>>> [1] https://issues.apache.org/jira/browse/FLINK-24138 > >>>>>>> [2] https://github.com/apache/flink/pull/17133 > >>>>>>> > >>>>>>> > >>>>>>> Best > >>>>>>> Ingo > >>>>>>> > >>>>>>> On Thu, Sep 2, 2021 at 12:31 PM Till Rohrmann < > trohrm...@apache.org> > >>>>>> wrote: > >>>>>>>> If it is possible to automate these kinds of checks, then I am all > >>> for > >>>>>> it > >>>>>>>> because everything else breaks eventually. So +1 for this > proposal. > >>>>>>>> > >>>>>>>> I don't have experience with what tools are available, though. > >>>>>>>> > >>>>>>>> I would like to add a rule that every test class extends directly > or > >>>>>>>> indirectly TestLogger because otherwise it is super hard to make > >>> sense > >>>>>> of > >>>>>>>> the test logs (Arvid will probably chime in stating that this will > >>> be > >>>>>>>> solved with Junit5 eventually). > >>>>>>>> > >>>>>>>> Not sure whether this is possible or not but if we can check that > >>> all > >>>>>>>> interfaces have properly defined JavaDocs on each method, then > this > >>>>>> could > >>>>>>>> also be helpful in my opinion. > >>>>>>>> > >>>>>>>> Cheers, > >>>>>>>> Till > >>>>>>>> > >>>>>>>> On Thu, Sep 2, 2021 at 11:16 AM Timo Walther <twal...@apache.org> > >>>>>> wrote: > >>>>>>>>> Hi Ingo, > >>>>>>>>> > >>>>>>>>> thanks for starting this discussion. Having more automation is > >>>>>>>>> definitely desirable. Esp. in the API / SDK areas where we > >>> frequently > >>>>>>>>> have to add similar comments to PRs. The more checks the better. > We > >>>>>>>>> definitely also need more guidelines (e.g. how to develop a Flink > >>>>>>>>> connector) but automation is safer then long checklists that > might > >>> be > >>>>>>>>> out of date quickly. > >>>>>>>>> > >>>>>>>>> +1 to the proposal. I don't have an opinion on the tool though. > >>>>>>>>> > >>>>>>>>> Regards, > >>>>>>>>> Timo > >>>>>>>>> > >>>>>>>>> > >>>>>>>>> On 01.09.21 11:03, Ingo Bürk wrote: > >>>>>>>>>> Hello everyone, > >>>>>>>>>> > >>>>>>>>>> I would like to start a discussion on introducing automated > tests > >>>>> for > >>>>>>>>> more > >>>>>>>>>> architectural rather than stilistic topics. For example, here > are > >>> a > >>>>>> few > >>>>>>>>>> things that seem worth checking to me (this is Table-API-focused > >>>>> since > >>>>>>>> it > >>>>>>>>>> is the subsystem I'm involved in): > >>>>>>>>>> > >>>>>>>>>> (a) All classes in o.a.f.table.api should be annotated with one > >>>>>>>>>> of @Internal, @PublicEvolving, or @Public. > >>>>>>>>>> (b) Classes whose name ends in *ConnectorOptions should be > located > >>>>> in > >>>>>>>>>> o.a.f.connector.*.table > >>>>>>>>>> (c) Classes implementing DynamicSourceFactory / > DynamicSinkFactory > >>>>>>>> should > >>>>>>>>>> have no static members of type ConfigOption > >>>>>>>>>> > >>>>>>>>>> There are probably significantly more cases worth checking, and > >>> also > >>>>>>>> more > >>>>>>>>>> involved ones (these are rather simple examples), like > disallowing > >>>>>>>> access > >>>>>>>>>> between certain packages etc. There are two questions I would > like > >>>>> to > >>>>>>>> ask > >>>>>>>>>> to the community: > >>>>>>>>>> > >>>>>>>>>> (1) Do you think such tests are useful in general? > >>>>>>>>>> (2) What use cases come to mind for you? > >>>>>>>>>> > >>>>>>>>>> If the idea finds consensus, I would like to use (2) to > >>> investigate > >>>>>>>> which > >>>>>>>>>> tooling to use. An obvious candidate is Checkstyle, as this is > >>>>> already > >>>>>>>>>> used. It also has the advantage of being well integrated in the > >>> IDE. > >>>>>>>>>> However, it is limited to looking at single files only, and > custom > >>>>>>>> checks > >>>>>>>>>> are pretty complicated and involved to implement[1]. Another > >>>>> possible > >>>>>>>>> tool > >>>>>>>>>> is ArchUnit[2], which would be significantly easier to maintain > >>> and > >>>>> is > >>>>>>>>> more > >>>>>>>>>> powerful, but in turn requires tests to be executed. If you have > >>>>>>>> further > >>>>>>>>>> suggestions (or thoughts) they would of course also be quite > >>>>> welcome, > >>>>>>>>>> though for now I would focus on (1) and (2) and go from there to > >>>>>>>>> evaluate. > >>>>>>>>>> [1] https://checkstyle.sourceforge.io/writingchecks.html > >>>>>>>>>> [2] https://www.archunit.org/ > >>>>>>>>>> > >>>>>>>>>> > >>>>>>>>>> Best > >>>>>>>>>> Ingo > >>>>>>>>>> > >>> > >