I have a concrete proposal that we can start with. It is not the final set of
markers we might eventually want, but one that we can start with and put to
immediate use.

I would like to adapt our tests to be immediately usable in Breeze (and
tied to it) and follow this approach:

*Proposed Breeze changes:*

   - `./breeze` by default will start only the main 'airflow-testing'
   image. This way Breeze will not need huge resources when it is started
   with default settings.
   - `./breeze --all-integrations` will start all dependent images (so we
   will be able to run all tests).
   - `./breeze --integrations [kubernetes,cassandra,mongo,rabbitmq,redis,openldap,kerberos]`
   - you will be able to choose which integrations you want to start.
   - `./breeze --backend postgres` will start only postgres, not mysql, and
   the other way round.
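
To make the intended behaviour of these switches concrete, here is a minimal
sketch in Python (Breeze itself is a shell script, so this only models the
logic). The function name, the service names other than 'airflow-testing',
and the sqlite default for the backend are my assumptions for illustration.

```python
# Sketch only: models which containers Breeze would start for given switches.
# Service names and the default backend are assumptions, not Breeze settings.
KNOWN_INTEGRATIONS = (
    "kubernetes", "cassandra", "mongo", "rabbitmq", "redis", "openldap", "kerberos",
)
# sqlite needs no separate container, hence None.
BACKEND_SERVICES = {"postgres": "postgres", "mysql": "mysql", "sqlite": None}


def services_to_start(backend="sqlite", integrations=(), all_integrations=False):
    """Return the list of services to start, main image first."""
    unknown = set(integrations) - set(KNOWN_INTEGRATIONS)
    if unknown:
        raise ValueError("unknown integrations: %s" % sorted(unknown))
    services = ["airflow-testing"]
    backend_service = BACKEND_SERVICES[backend]
    if backend_service is not None:
        # Only the selected backend is started, never the other ones.
        services.append(backend_service)
    chosen = KNOWN_INTEGRATIONS if all_integrations else integrations
    services.extend(name for name in KNOWN_INTEGRATIONS if name in chosen)
    return services
```

With no switches this returns only the main image, which is where the lower
default resource usage comes from.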

*Proposed Pytest marks:*

   - pytest.mark.integrations('kubernetes'),
   pytest.mark.integrations('cassandra'), .....
   - pytest.mark.backends("postgres"), pytest.mark.backends("mysql"),
   pytest.mark.backends("sqlite")

It's very easy to add custom switches to pytest and to auto-detect the
default setting, for example based on environment variables. We could follow
https://docs.pytest.org/en/latest/example/markers.html#custom-marker-and-command-line-option-to-control-test-runs
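
As a rough illustration of how little is needed, a conftest.py could register
the switches and the markers like this. It is only a sketch: the environment
variable names (ENABLED_INTEGRATIONS, BACKEND) and the helper split_csv are
assumptions I made up for the example; parser.addoption and
config.addinivalue_line are the regular pytest hooks' APIs.

```python
# conftest.py (sketch). The environment variable names below are
# assumptions for illustration, not existing Airflow/Breeze settings.
import os


def split_csv(value):
    """Turn "cassandra, mongo" into {"cassandra", "mongo"}."""
    return {part.strip() for part in (value or "").split(",") if part.strip()}


def pytest_addoption(parser):
    # Defaults are read from the environment, so a bare `pytest` inside
    # Breeze could pick up whatever Breeze exported when it started.
    parser.addoption(
        "--integrations",
        action="store",
        default=os.environ.get("ENABLED_INTEGRATIONS", ""),
        help="comma-separated integrations whose tests should run",
    )
    parser.addoption(
        "--backends",
        action="store",
        default=os.environ.get("BACKEND", ""),
        help="comma-separated backends whose tests should run",
    )


def pytest_configure(config):
    # Register the markers so that `pytest --strict-markers` accepts them.
    config.addinivalue_line(
        "markers", "integrations(name): test needs the named integration"
    )
    config.addinivalue_line(
        "markers", "backends(name): test is specific to the named backend"
    )
```

A hook such as pytest_collection_modifyitems could then read the options with
config.getoption("--integrations") and skip or deselect tests accordingly.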

*Proposed Pytest behaviour:*

   - `pytest` in Breeze will run all tests that are applicable within
   the current environment:
      - by default it will run only non-marked tests, plus the tests
      applicable to the currently selected backend
      - when (for example) you started Breeze with cassandra added, it will
      additionally run the pytest.mark.integrations('cassandra') tests
   - `pytest` in a local environment will by default run only non-marked tests
   - `pytest --integrations [kubernetes, ....]` will run only the selected
   integration tests (it will convert the switch into the corresponding
   markers, as explained in the example linked above)
   - `pytest --backends [postgres | mysql | sqlite]` will run only the
   tests that are specific to postgres/mysql/sqlite
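
The rules above can be written down as one small decision function. This is
a sketch under my own assumptions: the function and parameter names are
hypothetical, and how `--integrations` and `--backends` combine when both are
given is not decided yet (here a test runs if it matches either switch).

```python
def should_run(
    required_integrations,
    allowed_backends,
    enabled_integrations=frozenset(),
    current_backend=None,
    only_integrations=frozenset(),
    only_backends=frozenset(),
):
    """Decide whether a test runs, given its markers and the environment.

    required_integrations / allowed_backends come from the test's markers;
    enabled_integrations / current_backend describe the environment (empty
    and None for a plain local run); only_* come from explicit switches.
    """
    required = set(required_integrations)
    allowed = set(allowed_backends)
    if only_integrations or only_backends:
        # Explicit switches: run only tests carrying a matching marker.
        matches_integration = bool(required) and required <= set(only_integrations)
        matches_backend = bool(allowed & set(only_backends))
        return matches_integration or matches_backend
    # Default: non-marked tests always run; marked tests run only when the
    # environment provides what they need.
    if required and not required <= set(enabled_integrations):
        return False
    if allowed and current_backend not in allowed:
        return False
    return True
```

For example, a test marked with integrations('cassandra') is left out in a
local run, runs in Breeze once cassandra is started, and runs when selected
explicitly with `--integrations`.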

*What we will achieve by that:*

   - lower resource usage by Breeze by default (while allowing to run most
   of the tests)
   - easy selection of integration(s) we want to test
   - easy way to run all tests to reproduce CI run
   - capability of running just 'pytest' and testing (as fast as possible)
   all the tests that are applicable in your environment (if you want to be
   extra-sure everything works - for example during refactoring)
   - in the future we might be able to optimise CI and run a smaller set of
   tests for the postgres/mysql/sqlite 'only' cases, reducing the time of CI
   builds.


If I get a general "OK" from the community for this, I can make a set of
incremental changes to Breeze (as I continue working on the prod image) that
add those capabilities.

J.






On Wed, Dec 18, 2019 at 1:10 AM Kamil Breguła <kamil.breg...@polidea.com>
wrote:

> It is worth adding that we already use test marking in the project. For
> this purpose, we use the suffix "_system.py" in the file name.
> Unit tests:
>
> https://github.com/apache/airflow/blob/master/tests/operators/test_gcs_to_gcs.py
> System tests:
>
> https://github.com/apache/airflow/blob/master/tests/operators/test_gcs_to_gcs_operator_system.py
> Elsewhere, a special directory structure is used.
> Unit tests: https://github.com/apache/airflow/tree/master/tests/kubernetes
> Integration tests:
> https://github.com/apache/airflow/tree/master/tests/integration/kubernetes
>
> This will allow us to limit e.g. mocking in system tests.
> This seems to be a clearer solution because it clearly separates each type
> of test. If we add markers, they may not be noticed when making or
> reviewing changes. The file name is immediately visible.
> Recently I dealt with a case where system tests included mocking, which
> by definition cannot work.
>
> https://github.com/apache/airflow/commit/11262c6d42c4612890a6eec71783e0a6d5b22c17
>
>
> On Tue, Dec 10, 2019 at 2:22 PM Jarek Potiuk <jarek.pot...@polidea.com>
> wrote:
>
> > I am all-in for markers.
> >
> > I think we should start with a small set of markers that serve a useful
> > purpose from the beginning, and implement them first to learn how useful
> > they are (before we decide on the full set of markers).
> > Otherwise maintaining those markers will become a fruitless "chore" and
> > might be abandoned.
> >
> > So my proposal is to agree on the top cases we want to handle with
> > markers and then define/apply the markers accordingly:
> >
> > Those are my three top priorities (from most important to least):
> >
> >    - Splitting out the integration tests (and updating Breeze) so that you
> >    choose which integrations to start when you start Breeze, rather than
> >    starting them all.
> >    - DB separation, so that we do not repeat non-DB tests on all databases.
> >    - Proper separation of Kubernetes tests (they are now filtered out based
> >    on skipif/env variables).
> >
> >
> > J.
> >
> >
> > On Tue, Dec 10, 2019 at 1:32 PM Tomasz Urbaszek <
> > tomasz.urbas...@polidea.com>
> > wrote:
> >
> > > Hi everyone,
> > >
> > > Since we run our tests using pytest, we are able to use test markers [1].
> > > Using them will give us some useful things:
> > > - additional information about the test type (e.g. when used for system
> > > tests)
> > > - an easy way to select tests by type (e.g. pytest -v -m "not system")
> > > - a way to split our test suite more effectively (no need to run all
> > > tests on 3 backends)
> > >
> > > I would like to discuss what "official" marks we would like to use. As a
> > > base, I would suggest marking tests as:
> > > - system - tests that need the outside world to be successful (e.g. GCP
> > > system tests)
> > > - db[postgres, sqlite, mysql] - tests that require a database to be
> > > successful, in other words, tests that create some db side effects
> > > - integration - tests that require some additional resources like
> > > Cassandra or Kubernetes
> > >
> > > All other, unmarked tests would be treated as "pure", meaning that they
> > > have no side effects (at least on the database level).
> > >
> > > What do you think about this? Does anyone have some experience with
> > > using markers in such a big project?
> > >
> > > [1] http://doc.pytest.org/en/latest/example/markers.html
> > >
> > >
> > > Bests,
> > > Tomek Urbaszek
> > >
> >
> >
> > --
> >
> > Jarek Potiuk
> > Polidea <https://www.polidea.com/> | Principal Software Engineer
> >
> > M: +48 660 796 129 <+48660796129>
> >
>


-- 

Jarek Potiuk
Polidea <https://www.polidea.com/> | Principal Software Engineer

M: +48 660 796 129 <+48660796129>
