> It's hard to guess how many test sets are required, how many extra lines
> of "marker code" are needed for each category, and how the Venn diagrams
> work out.
I believe (but that's mostly gut feeling) that a significant majority of our
tests will fall into the "no-marker" camp. @Tomasz Urbaszek
<tomasz.urbas...@polidea.com> -> maybe you have a more detailed breakdown of
the timing of the tests faster than 1 s (currently 85% of our tests)? I think
all tests faster than ~0.1 s will be the "no-marker" type of tests. I think
for sure we have the "integration" and "backend" categories I mentioned, and
the "system" ones in the future.

I did some quick calculations for the GCP tests from 1.10 (it's approximate,
as we have so far no simple way of grouping the tests, and it does not even
include parameterized tests):

- All GCP tests: 352 operators, 453 hooks, 16 sensors = 821 tests
- Fake GCP tests: 1 test using boto and real S3 that can be converted to
  moto (0.1%)
- Integration GCP tests: 0 - no tests requiring external integrations (0%)
- Backend tests: Postgres: 4, MySQL: 7, SQLite: 0 = 11 (1%)
- System GCP tests: 22 operators + 18 hooks = 40 tests (4%)

So we are really talking about a small number. It might be different for
other tests, but I think the vast majority of our tests will be "no-marker"
tests.

> I don't want to get into that as I don't have familiarity with all of it,
> but my first intuition is that markers will provide granularity at the
> expense of a lot more "marker code", unless there is always a common
> default test-env and extra tests are only required for the exceptions to
> the defaults.

I think it's not that bad. In most cases (if not all), the markers we are
talking about in the integration/fake cases are "per-class" rather than
per-test-method. And we can use markers at the class level, even if it is
not explicitly mentioned in the pytest docs. See the last example here:
https://www.seleniumeasy.com/python/grouping-tests-with-pytest-mark
and the sketch at the end of this mail.

> How would the proposed marker scheme categorise a test that uses mocked
> infrastructure for AWS batch services? Consider how much AWS
> infrastructure is mocked in a moto server to test batch services, i.e. see
> [1,2]. In a real sense, the moto library provides a server with a
> container runtime, it's "mocked infrastructure" that helps to "fake
> integration" tests.

For me it clearly falls into the "fake" category. It does not need a real
service, nor does it mock/stub individual methods - it provides a fake
implementation of the S3 services (there is a small moto sketch at the end
of this mail as well). That's a good example underlining my point about
where our industry has failed in the area of test terminology. We have
mocks/stubs/fakes/doubles/... and we have unit/smoke/integration/system/e2e
tests, and we cannot agree on what those terms mean :D. If we followed
Martin Fowler's terminology, the moto library should be called foto :)

> +1 for a common vocabulary (semantics) for tests and markers; I'm not a
> test-expert by a long shot, so what is the best practice for a test
> vocabulary and how does it translate into markers? Does the Apache
> Foundation have any kind of manifesto about such things?

That's the whole problem: there are so many competing and contradictory
best practices and terms. Almost as many as there are projects, languages,
and test frameworks :D. I tried to find one that the ASF would recommend,
but (not surprisingly) could not find any. I think it's really something
that is hard to standardise at the Foundation level - precisely because
there are so many competing approaches and so many frameworks using
different ideas and terms - so it would be a futile effort. And I think it
should be left as a project-level decision (i.e. the community decides on
the terminology used). So here we are...
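To make the per-class "marker code" overhead concrete, here is a minimal
sketch of what I have in mind (the "backend" marker name and the
TestMySqlHook class are hypothetical examples only, not an agreed
convention):

    import pytest

    # A single class-level marker applies to every test method in the
    # class, so the extra "marker code" is one line per test class,
    # not one line per test.
    @pytest.mark.backend("mysql")
    class TestMySqlHook:
        def test_insert_rows(self):
            ...

        def test_bulk_load(self):
            ...

    # In conftest.py we would register the marker once, so that pytest
    # does not warn about an unknown mark:
    #
    # def pytest_configure(config):
    #     config.addinivalue_line(
    #         "markers",
    #         "backend(name): test requires the given database backend",
    #     )

The "no-marker" tests then need no extra code at all: pytest -m "not
backend" gives the plain default run, while pytest -m backend selects only
the marked classes.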
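And to illustrate the "fake" category with code: a moto-based test talks to
a complete in-process fake of the S3 API instead of stubbing individual
methods. A minimal sketch, assuming moto's mock_s3 decorator and boto3 (the
bucket and key names are made up):

    import boto3
    from moto import mock_s3


    @mock_s3
    def test_upload_and_read_back():
        # No real AWS account, credentials or network access is needed:
        # moto intercepts the boto3 calls and serves them from an
        # in-memory fake implementation of S3.
        client = boto3.client("s3", region_name="us-east-1")
        client.create_bucket(Bucket="test-bucket")
        client.put_object(Bucket="test-bucket", Key="data.txt", Body=b"hello")

        body = client.get_object(Bucket="test-bucket", Key="data.txt")["Body"].read()
        assert body == b"hello"

The test exercises the same boto3 code paths a real integration test would,
which is exactly why "fake" is a more honest name for it than "mock".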
Since we moved to pytest, I think we are at the point where we can start
discussing the options we have. Then we should propose some terminology (a
consistent approach), vote on it, document it, and follow it as a community.

J.

-- 
Jarek Potiuk
Polidea <https://www.polidea.com/> | Principal Software Engineer

M: +48 660 796 129