Re: Grouping tests using pytest markers

2020-01-07 Thread Jarek Potiuk
Since we need to tackle CI stability, I would like to bump this one in case
anyone wants to say something :) I am going full speed ahead with
implementing those changes.

*Proposed Breeze changes:*
>
>- `./breeze` by default will start only the main 'airflow-testing'
>image. This way no huge resource usage will be needed when breeze is
>started by default
>- `./breeze --all-integrations` will start all dependent images (so we
>will be able to run all tests)
>- `./breeze --integrations [kubernetes,cassandra,mongo,
>rabbitmq,redis,openldap,kerberos]` - you will be able to choose which
>integrations you want to start
>- When you run `breeze --backend postgres` it will only start postgres,
>not mysql, and the other way round.
>
> I have a PR in progress: https://github.com/apache/airflow/pull/7091
(it depends on a few other PRs)

After this is merged, ./breeze will only start the 'airflow-testing' image.
You will be able to launch the other docker images (mongo/cassandra and
others) with *--integration mongo --integration cassandra* etc. (or
--integration all to launch all of them). This will be great for local
testing (resource usage!). It will also work in CI (I will split the test
jobs into separate ones).


> *Proposed Pytest marks:*
>
>- pytest.mark.integrations('kubernetes'),
>pytest.mark.integrations('cassandra'), ...
>- pytest.mark.backends("postgres"), pytest.mark.backends("mysql"),
>pytest.mark.backends("sqlite")
>
> While going through the tests I will identify the ones that need particular
integrations, mark/skip them appropriately, and work out the right pytest
behaviour. A rough sketch of how the markers could look is below.
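
To illustrate, a minimal sketch of how such markers could look in a test
module (the exact marker names and arguments are still to be agreed, so
treat this purely as an illustration):

    import pytest

    @pytest.mark.integrations("cassandra")      # needs the cassandra container started by Breeze
    class TestCassandraHook:
        def test_record_exists(self):
            ...

    @pytest.mark.backends("postgres", "mysql")  # only meaningful on these backends
    def test_bulk_insert_uses_copy():
        ...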

*Proposed Pytest behaviour:*
>
>- `pytest` -> in Breeze will run all tests that are applicable within
>the current environment:
>   - it will only run non-marked tests by default, applicable with the
>   currently selected backend
>   - when (for example) you have started cassandra, it will
>   additionally run the pytest.mark.integrations('cassandra') tests
>- `pytest` in a local environment will by default only run non-marked
>tests
>- `pytest --integrations [kubernetes, ...]` will only run the
>selected integration tests (the switch will be converted into the
>corresponding markers, as explained in the example above)
>- `pytest --backends [postgres|mysql|sqlite]` will only run the
>tests specific to postgres/mysql/sqlite
>
> More details when I get to this one. Ideally everything should be
autodetected - i.e. when you have no integration enabled, the corresponding
tests should be skipped, and we should also be able to run the tests for a
particular integration (or several selected ones) with one command - see
the conftest.py sketch below.
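
As a rough sketch of the autodetection idea (the ENABLED_INTEGRATIONS
variable and the skipping logic are my assumptions for illustration, not
the final design), conftest.py could do something like:

    # conftest.py (sketch only)
    import os
    import pytest

    def pytest_collection_modifyitems(config, items):
        # Breeze could export e.g. ENABLED_INTEGRATIONS="cassandra mongo"
        enabled = set(os.environ.get("ENABLED_INTEGRATIONS", "").split())
        for item in items:
            for marker in item.iter_markers(name="integrations"):
                if marker.args[0] not in enabled:
                    item.add_marker(pytest.mark.skip(
                        reason=f"integration {marker.args[0]} is not started"))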

J.


Re: Grouping tests using pytest markers

2019-12-29 Thread Jarek Potiuk
>
> It's hard to guess how many test sets are required and how many
> extra lines of "marker code" are needed for each category and how the Venn
> diagrams work out.


I believe (but that's mostly gut feeling) that a significant majority of
our tests will fall into the "no-marker" camp. @Tomasz Urbaszek - maybe you
have a bit more of a breakdown of the timing of tests faster than 1s
(currently 85% of tests)?
I think all tests faster than ~0.1 s or so will be the "no marker" type of
tests.

I think for sure we have the "integrations" and "backend" categories I
mentioned, plus the "system" ones in the future.

I did some quick calculations for GCP tests from 1.10 (it's approximate, as
we so far have no simple way of grouping the tests, and it does not even
include parameterized tests):

   - All GCP tests: 352 operators, 453 hooks, 16 sensors = 821 tests
   - Fake GCP tests: 1 test using boto and real S3 that can be converted
   to moto (0.1%)
   - Integration GCP tests: 0 - no tests requiring external integrations.
   (0%)
   - Backend tests: Postgres: 4, Mysql 7, Sqlite  = 11 (1%)
   - System GCP tests: 22 operators + 18 hooks = 40 tests (4%)

So we are really talking about a small number. It might be different for
other test areas, but I think the vast majority of tests will be
"no-marker" tests.


> I don't want to get into that as I don't have
> familiarity with all of it, but my first intuition is that markers will
> provide granularity at the expense of a lot more "marker code", unless
> there is always a common default test-env and extra tests are only required
> for the exceptions to the defaults.)
>

I think it's not that bad. In most (if not all) cases, the markers we are
talking about for the integration/fake categories are "per-class" rather
than per test method. And we can apply markers at the class level, even if
it is not explicitly mentioned in the pytest docs (a short sketch is
below). See the last example here:
https://www.seleniumeasy.com/python/grouping-tests-with-pytest-mark
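
For illustration - a class-level marker is inherited by every test method
in the class (a sketch only; the class name is made up and the marker name
follows the proposal in this thread):

    import pytest

    @pytest.mark.integrations("cassandra")
    class TestCassandraToGCSOperator:
        # both methods below get the class-level marker automatically
        def test_execute(self):
            ...

        def test_convert_types(self):
            ...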


>
> How would the proposed marker scheme categorise a test that uses mocked
> infrastructure for AWS batch services?  Consider how much AWS
> infrastructure is mocked in a moto server to test batch services, i.e. see
> [1,2].  In a real sense, the moto library provides a server with a
> container runtime, it's "mocked infrastructure" that helps to "fake
> integration" tests.


For me it clearly falls into the "fake" category. It does not need the real
service, nor does it mock/stub the methods - it provides a fake
implementation of the S3 services. That's a good example underlining my
point about where our industry has failed in the test-terminology area.
We have mocks/stubs/fakes/doubles/... and we have
unit/smoke/integration/system/e2e tests, and we cannot agree on what those
terms mean :D. If we follow Martin Fowler's terminology, the moto library
should really be called "foto" :)
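
To make the distinction concrete, a "fake"-style test with moto looks
roughly like this - the library serves a fake, in-memory implementation of
the service instead of stubbing individual methods (a sketch only; the
bucket/key names are made up and I assume moto's mock_s3 decorator here):

    import boto3
    from moto import mock_s3

    @mock_s3
    def test_upload_creates_object():
        # all boto3 calls below hit moto's in-memory fake, not real AWS
        client = boto3.client("s3", region_name="us-east-1")
        client.create_bucket(Bucket="test-bucket")
        client.put_object(Bucket="test-bucket", Key="data.csv", Body=b"1,2,3")
        assert client.list_objects_v2(Bucket="test-bucket")["KeyCount"] == 1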


> +1 for a common vocabulary (semantics) for tests and
> markers; I'm not a test-expert by a long shot, so what is the best practice
> for a test vocabulary and how does it translate into markers?  Does the
> Apache Foundation have any kind of manifesto about such things?
>

That's exactly the problem: there are so many competing and contradicting
best practices/terms - almost as many as there are projects, languages, and
test frameworks :D. I tried to find one that the ASF would recommend but
(not surprisingly) could not find any. I think it's really something that
is hard to standardise at the Foundation level - precisely because there
are so many competing approaches and so many frameworks using different
ideas/terms that it would be a futile effort. I think it should be left as
a project-level decision (i.e. the community decides on the terminology
used).

So here we are...

Since we moved to pytest, I think we are at the point where we can start
discussing the options we have.
Then we should propose some terminology (a consistent approach), vote on
it, and document and follow it as a community.

J.

-- 

Jarek Potiuk
Polidea | Principal Software Engineer

M: +48 660 796 129


Re: Grouping tests using pytest markers

2019-12-29 Thread Darren Weber
The link to
https://docs.pytest.org/en/latest/example/markers.html#custom-marker-and-command-line-option-to-control-test-runs
helps to clarify some of the customization required to add CLI options that
select test sets based on markers.  +1 for a common default with *no
marker*.  (It's hard to guess how many test sets are required and how many
extra lines of "marker code" are needed for each category and how the Venn
diagrams work out.  I don't want to get into that as I don't have
familiarity with all of it, but my first intuition is that markers will
provide granularity at the expense of a lot more "marker code", unless
there is always a common default test-env and extra tests are only required
for the exceptions to the defaults.)

How would the proposed marker scheme categorise a test that uses mocked
infrastructure for AWS batch services?  Consider how much AWS
infrastructure is mocked in a moto server to test batch services, i.e. see
[1,2].  In a real sense, the moto library provides a server with a
container runtime, it's "mocked infrastructure" that helps to "fake
integration" tests.  +1 for a common vocabulary (semantics) for tests and
markers; I'm not a test-expert by a long shot, so what is the best practice
for a test vocabulary and how does it translate into markers?  Does the
Apache Foundation have any kind of manifesto about such things?

[1]
https://github.com/spulec/moto/blob/master/tests/test_batch/test_batch.py
[2] https://github.com/spulec/moto/blob/master/moto/batch/models.py


On Sun, Dec 29, 2019 at 7:48 AM Jarek Potiuk 
wrote:

> >
> > If I understand correctly, using `pytest -k` might be less work and more
> > generalized than a swag of custom makers, unless it entails a lot of
> > re-naming things.  The work to add markers might be easier if they can be
> > applied to entire classes of tests, although what I've mostly seen with
> > `pytest` is a functional pattern rather than classes in tests.  For more
> > about that, see the note about using pytest fixtures vs. class
> > setup/teardown at https://docs.pytest.org/en/latest/xunit_setup.html
>
>
> I think `pytest -k` is great for ad-hoc/manual execution of only what we
> want. But for automation around running tests (which should be repeatable
> and reproducible by anyone), I think it makes much more sense to keep
> markers in the code.
>
> It's really just a matter where we keep information about how we group
> tests in common categories that we use for test execution.
>
>1. with pytest -k - we would have to keep the "grouping" as different
>set of  -k parameters in CI test scripts. This requires following naming
>conventions for modules or classes or tests. Similar to what Kamil
>described earlier in the thread: we already use *_system.py module +
>SystemTest class naming in GCP tests.
>2. with markers, the grouping is kept in the source code of tests
>instead. This is a "meta" information that does not force any naming
>convention on the tests.
>
> I strongly prefer 2. over 1. for test automation.
>
> Some reasoning:
>
>- It makes it easier to reproduce grouping locally without having to
>look-up the selection criteria/naming conventions.
>- It's easier to make automation around it (for example in case of
>integrations we can easily select cases where "integration" from
>environment matches the integration marker. For example cassandra
>integration will be matched by integration("cassandra") marker. With
> naming
>convention we would have to record somewhere (in the custom -k command)
>that "cassandra" integration matches (for example) all tests in
>"tests.cassandra" package, or all tests named TestCassandra or something
>even more complex. Defining custom marker seems like much more obvious
> and
>easy to follow.
>- Naming conventions are sometimes not obvious when you look at the code
>- as opposed to markers are quite obvious to follow in the code when you
>add new tests of the same "category".
>- Last but not least - you can combine different markers together. For
>example we can have Cassandra (integration) + MySql (backend) tests. So
>markers are "labels" and you can apply more of them to the same test.
>Naming convention makes it difficult (or impossible) to combine
> different
>categories together - You would have to have non-overlapping conventions
>and as we add more categories it might become impossible. For example if
>you look at my proposal below  - we will likely have a number of
>"system(gcp)" and "backend("postgres") tests for tests that are testing
>System tests for Postgres to BigQuery.
>
> For me, the last reason from the list above is a deal-breaker. I can very
> easily imagine overlapping categories of tests we come up with and markers
> give us great flexibility here.
>
> With regard to "slow" and https://github.com/apache/airflow/pull/6876, it
> > was motivated by one test that uses 

Re: Grouping tests using pytest markers

2019-12-29 Thread Jarek Potiuk
>
> If I understand correctly, using `pytest -k` might be less work and more
> generalized than a swag of custom makers, unless it entails a lot of
> re-naming things.  The work to add markers might be easier if they can be
> applied to entire classes of tests, although what I've mostly seen with
> `pytest` is a functional pattern rather than classes in tests.  For more
> about that, see the note about using pytest fixtures vs. class
> setup/teardown at https://docs.pytest.org/en/latest/xunit_setup.html


I think `pytest -k` is great for ad-hoc/manual execution of only what we
want. But for automation around running tests (which should be repeatable
and reproducible by anyone), I think it makes much more sense to keep
markers in the code.

It's really just a matter where we keep information about how we group
tests in common categories that we use for test execution.

   1. with pytest -k - we would have to keep the "grouping" as different
   set of  -k parameters in CI test scripts. This requires following naming
   conventions for modules or classes or tests. Similar to what Kamil
   described earlier in the thread: we already use *_system.py module +
   SystemTest class naming in GCP tests.
   2. with markers, the grouping is kept in the source code of tests
   instead. This is a "meta" information that does not force any naming
   convention on the tests.

I strongly prefer 2. over 1. for test automation.

Some reasoning:

   - It makes it easier to reproduce grouping locally without having to
   look-up the selection criteria/naming conventions.
   - It's easier to make automation around it (for example in case of
   integrations we can easily select cases where "integration" from
   environment matches the integration marker. For example cassandra
   integration will be matched by integration("cassandra") marker. With naming
   convention we would have to record somewhere (in the custom -k command)
   that "cassandra" integration matches (for example) all tests in
   "tests.cassandra" package, or all tests named TestCassandra or something
   even more complex. Defining custom marker seems like much more obvious and
   easy to follow.
   - Naming conventions are sometimes not obvious when you look at the
   code, whereas markers are quite obvious to follow when you add new tests
   of the same "category".
   - Last but not least - you can combine different markers. For example,
   we can have Cassandra (integration) + MySQL (backend) tests. Markers are
   "labels", and you can apply several of them to the same test. A naming
   convention makes it difficult (or impossible) to combine different
   categories - you would have to have non-overlapping conventions, and as
   we add more categories it might become impossible. For example, if you
   look at my proposal below, we will likely have a number of tests marked
   with both system("gcp") and backend("postgres") - e.g. the system tests
   for Postgres-to-BigQuery transfers.

For me, the last reason from the list above is a deal-breaker. I can very
easily imagine overlapping categories of tests we come up with and markers
give us great flexibility here.
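
To illustrate combining markers (a sketch only - the system/backends marker
names mirror the examples mentioned above and are purely illustrative, as
is the test itself):

    import pytest

    # a hypothetical Postgres-to-BigQuery system test - both labels apply at once
    @pytest.mark.system("gcp")
    @pytest.mark.backends("postgres")
    def test_postgres_to_bigquery_transfer():
        ...

Selecting the intersection is then a one-liner: pytest -m "system and backends".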

With regard to "slow" and https://github.com/apache/airflow/pull/6876, it
> was motivated by one test that uses moto mocking for AWS batch services.
> In particular, it has a mock batch job that actually runs a container and
> the user of the mock has no control over how the job transitions from
> various job states (with associated status).  For example, the `pytest`
> durations are an order of magnitude longer for this test than all others
> (see below stdout from a PR branch of mine).  So, during dev-test cycles,
> once this test is coded and working as expected, it helps to either
> temporarily mark it with `pytest.mark.skip` or to permanently mark it with
> a custom marker (e.g. `pytest.mark.slow`) and then use the `pytest -m 'not
> slow'` to run all the faster tests.  It's no big deal, I can live without
> it, it's just a convenience.
>

With regard to "slow" tests - maybe the right approach here will be to use
a different marker. I think "slow" suggests that there is a "fast"
somewhere, and then we need to decide how slow is slow.

As an inspiration - I really like the distinction introduced by Martin
Fowler:
https://www.martinfowler.com/articles/mocksArentStubs.html#ClassicalAndMockistTesting
- where he distinguishes between different types of "test doubles" (dummy,
fake, stub, spy, mock). Unfortunately, this terminology is not universally
accepted, but for the sake of this discussion assume we follow it: then I
think the "fast" tests use stubs, mocks or spies, whereas the "slow" tests
you mention use "fakes" (your scripts are really fakes).
The "fake" tests are usually much slower, but "fake" might not be a good
marker name because the term is not universally agreed on.

But maybe we can come up with something that indicates the tests that are
using "fakes" rather than 

Re: Grouping tests using pytest markers

2019-12-28 Thread Darren Weber
If I understand correctly, using `pytest -k` might be less work and more
generalized than a swag of custom markers, unless it entails a lot of
re-naming things.  The work to add markers might be easier if they can be
applied to entire classes of tests, although what I've mostly seen with
`pytest` is a functional pattern rather than classes in tests.  For more
about that, see the note about using pytest fixtures vs. class
setup/teardown at https://docs.pytest.org/en/latest/xunit_setup.html

With regard to "slow" and https://github.com/apache/airflow/pull/6876, it
was motivated by one test that uses moto mocking for AWS batch services.
In particular, it has a mock batch job that actually runs a container and
the user of the mock has no control over how the job transitions from
various job states (with associated status).  For example, the `pytest`
durations are an order of magnitude longer for this test than all others
(see below stdout from a PR branch of mine).  So, during dev-test cycles,
once this test is coded and working as expected, it helps to either
temporarily mark it with `pytest.mark.skip` or to permanently mark it with
a custom marker (e.g. `pytest.mark.slow`) and then use the `pytest -m 'not
slow'` to run all the faster tests.  It's no big deal, I can live without
it, it's just a convenience.

13.77s call
tests/providers/amazon/aws/hooks/test_batch_waiters.py::test_aws_batch_job_waiting
0.23s setup
tests/providers/amazon/aws/hooks/test_batch_waiters.py::test_aws_batch_job_waiting
0.11s call
tests/providers/amazon/aws/hooks/test_batch_client.py::TestAwsBatchClient::test_poll_job_complete_raises_for_max_retries
0.09s call
tests/providers/amazon/aws/hooks/test_batch_waiters.py::TestAwsBatchWaiters::test_wait_for_job_raises_for_client_error
0.01s call
tests/providers/amazon/aws/hooks/test_batch_client.py::TestAwsBatchClientDelays::test_exponential_delay_03
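
As a side note, the convenience described above is just a registered marker
plus a deselect expression - a sketch (the marker would need to be declared
in pytest.ini or setup.cfg to avoid "unknown marker" warnings):

    # in pytest.ini (or the [tool:pytest] section of setup.cfg):
    #   markers =
    #       slow: marks a test as slow (deselect with -m "not slow")

    import pytest

    @pytest.mark.slow
    def test_aws_batch_job_waiting():
        ...

    # then, during dev-test cycles: pytest -m "not slow"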

On Fri, Dec 27, 2019 at 11:39 AM Tomasz Urbaszek <
tomasz.urbas...@polidea.com> wrote:

> The suggestion of using -k flag is really interesting. It will require a
> lot of changes but adding
>  marks will require the same effort. However, I think that using a marker
> is more explicit and
> easier to spot.
>
> Regarding "slow test" marker, I did a quick calculation of tests execution
> times (execution and setups):
>
> Above 1s : 16.53%, time together: 18.29m
> Above 2s : 8.13%, time together: 14.43m
> Above 3s : 4.14%, time together: 11.54m
> Above 4s : 2.94%, time together: 10.3m
> Above 5s : 2.2%, time together: 9.3m
> Above 10s : 0.89%, time together: 6.51m
> Above 20s : 0.47%, time together: 4.53m
> Above 60s : 0.0%, time together: 0.0m
>
>
> Total time of the example build: 28m. I am not sure when a test is "slow".
> Moreover, I think there could be
> a difference in times between local environment (where developer will
> decide to use such marker) and
> the CI environment, thus resulting in a potential inconsistency.
>
> T.
>
> On Fri, Dec 27, 2019 at 7:58 PM Darren Weber 
> wrote:
>
> >
> >
> > Consider all the options for filtering tests:
> > - http://doc.pytest.org/en/latest/example/markers.html
> >
> > The `pytest -k` filters are very useful.  Provide guidelines on how to
> > name things so that `pytest -k` can be used to filter categories of
> tests.
> > Use markers for tests that might be the exception to the rule within a
> > module or class of tests (avoid the overhead of marking all the tests
> with
> > all the markers).  Consider using classes that contain all the same
> marker
> > (but a module and/or class name could serve this purpose too).
> >
> > -- Darren
> >
> > PS, I stumbled in this after creating
> > https://github.com/apache/airflow/pull/6876 to mark a slow test
> >
> > On 2019/12/27 11:30:09, Tomasz Urbaszek  wrote:
> > > +1 for integrations and backends, it's a good start ;)
> > >
> > > T.
> > >
> > > On Fri, Dec 27, 2019 at 12:16 PM Jarek Potiuk <
> jarek.pot...@polidea.com>
> > > wrote:
> > >
> > > > Since I am going to start working on it soon - I'd love to get some
> > > > opinions :).
> > > >
> > > > J.
> > > >
> > > > On Mon, Dec 23, 2019 at 11:13 AM Jarek Potiuk <
> > jarek.pot...@polidea.com>
> > > > wrote:
> > > >
> > > > > I have a concrete proposal that we can start with. It's not a final
> > set
> > > > of
> > > > > markers we might want to have but one that we can start with and
> > make an
> > > > > immediate use of.
> > > > >
> > > > > I would like to adapt our tests to be immediately usable in Breeze
> > (and
> > > > > tied with it) and follow this approach:
> > > > >
> > > > > *Proposed Breeze changes:*
> > > > >
> > > > >- `./breeze` by default will start only the main
> 'airflow-testing'
> > > > >image. This way no huge resource usage will be needed when
> breeze
> > is
> > > > >started by default
> > > > >- './breeze --all-integrations` will start all dependent images
> > (so we
> > > > >will be able to run all tests)
> > > > >- './breeze --integrations 

Re: Grouping tests using pytest markers

2019-12-27 Thread Tomasz Urbaszek
The suggestion of using the -k flag is really interesting. It will require
a lot of changes, but adding marks will require the same effort. However, I
think that using a marker is more explicit and easier to spot.

Regarding "slow test" marker, I did a quick calculation of tests execution
times (execution and setups):

Above 1s : 16.53%, time together: 18.29m
Above 2s : 8.13%, time together: 14.43m
Above 3s : 4.14%, time together: 11.54m
Above 4s : 2.94%, time together: 10.3m
Above 5s : 2.2%, time together: 9.3m
Above 10s : 0.89%, time together: 6.51m
Above 20s : 0.47%, time together: 4.53m
Above 60s : 0.0%, time together: 0.0m


Total time of the example build: 28m. I am not sure when a test counts as
"slow". Moreover, I think there could be a difference in times between the
local environment (where a developer would decide to apply such a marker)
and the CI environment, resulting in a potential inconsistency.

T.

On Fri, Dec 27, 2019 at 7:58 PM Darren Weber 
wrote:

>
>
> Consider all the options for filtering tests:
> - http://doc.pytest.org/en/latest/example/markers.html
>
> The `pytest -k` filters are very useful.  Provide guidelines on how to
> name things so that `pytest -k` can be used to filter categories of tests.
> Use markers for tests that might be the exception to the rule within a
> module or class of tests (avoid the overhead of marking all the tests with
> all the markers).  Consider using classes that contain all the same marker
> (but a module and/or class name could serve this purpose too).
>
> -- Darren
>
> PS, I stumbled in this after creating
> https://github.com/apache/airflow/pull/6876 to mark a slow test
>
> On 2019/12/27 11:30:09, Tomasz Urbaszek  wrote:
> > +1 for integrations and backends, it's a good start ;)
> >
> > T.
> >
> > On Fri, Dec 27, 2019 at 12:16 PM Jarek Potiuk 
> > wrote:
> >
> > > Since I am going to start working on it soon - I'd love to get some
> > > opinions :).
> > >
> > > J.
> > >
> > > On Mon, Dec 23, 2019 at 11:13 AM Jarek Potiuk <
> jarek.pot...@polidea.com>
> > > wrote:
> > >
> > > > I have a concrete proposal that we can start with. It's not a final
> set
> > > of
> > > > markers we might want to have but one that we can start with and
> make an
> > > > immediate use of.
> > > >
> > > > I would like to adapt our tests to be immediately usable in Breeze
> (and
> > > > tied with it) and follow this approach:
> > > >
> > > > *Proposed Breeze changes:*
> > > >
> > > >- `./breeze` by default will start only the main 'airflow-testing'
> > > >image. This way no huge resource usage will be needed when breeze
> is
> > > >started by default
> > > >- './breeze --all-integrations` will start all dependent images
> (so we
> > > >will be able to run all tests)
> > > >- './breeze --integrations [kubernetes,cassandra,mongo,
> > > >rabbitmq,redis,openldap,kerberos] - you will be able to choose
> which
> > > >integrations you want to start
> > > >- When you run `breeze --backend postgres` it will only start
> postgres
> > > >not mysql and the other way round.
> > > >
> > > > *Proposed Pytest marks:*
> > > >
> > > >-
> > > >
> > >
> pytest.mark.integrations('kubernetes'),pytest.mark.integrations('cassandra'),.
> > > >- pytest,mark.backends("postgres"), pytest,mark.backends("mysql"),
> > > >pytest.mark.backends("sqlite")
> > > >
> > > > It's very easy to add custom switches to pytest and auto-detect what
> is
> > > > the default setting based on environment variables for example. We
> could
> > > > follow
> > > >
> > >
> https://docs.pytest.org/en/latest/example/markers.html#custom-marker-and-command-line-option-to-control-test-runs
> > > > .
> > > >
> > > > *Proposed Pytest behaviour:*
> > > >
> > > >- `pytest` -> in Breeze will run all tests that are applicable
> within
> > > >the current environment:
> > > >   - it will only run non-marked tests by default, applicable with
> > > >   current selected backend
> > > >   - when (for example) you stared cassandra is added it will
> > > >   additionally run pytest.mark.integrations('cassandra')
> > > >- `pytest` in local environment by default will only run
> non-marked
> > > >tests
> > > >- `pytest --integrations [kubernetes, ]` will only run the
> > > >integration tests selected (will convert the switch into the
> > > corresponding
> > > >markers (as explained in the example above)
> > > >- `pytest --backends [postgres| mysql | sqlite] will only run the
> > > >specific tests that use postgres/mysql/sqlite specific tests
> > > >
> > > > *What we will achieve by that:*
> > > >
> > > >- lower resource usage by Breeze by default (while allowing to run
> > > >most of the tests)
> > > >- easy selection of integration(s) we want to test
> > > >- easy way to run all tests to reproduce CI run
> > > >- capability of running just 'pytest' and testing (as fast as
> > > >possible) all the tests that are applicable in 

Re: Grouping tests using pytest markers

2019-12-27 Thread Darren Weber



Consider all the options for filtering tests:
- http://doc.pytest.org/en/latest/example/markers.html

The `pytest -k` filters are very useful.  Provide guidelines on how to name 
things so that `pytest -k` can be used to filter categories of tests.  Use 
markers for tests that might be the exception to the rule within a module or 
class of tests (avoid the overhead of marking all the tests with all the 
markers).  Consider using classes that contain all the same marker (but a 
module and/or class name could serve this purpose too).
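
For example, purely name-based selection could look like this (the patterns
are illustrative only - the actual naming guidelines would need to be
agreed first):

    pytest -k "not system"      # skip tests whose module/class/function name mentions "system"
    pytest -k "gcs_to_gcs"      # run only tests from modules/classes matching "gcs_to_gcs"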

-- Darren

PS, I stumbled in this after creating 
https://github.com/apache/airflow/pull/6876 to mark a slow test

On 2019/12/27 11:30:09, Tomasz Urbaszek  wrote: 
> +1 for integrations and backends, it's a good start ;)
> 
> T.
> 
> On Fri, Dec 27, 2019 at 12:16 PM Jarek Potiuk 
> wrote:
> 
> > Since I am going to start working on it soon - I'd love to get some
> > opinions :).
> >
> > J.
> >
> > On Mon, Dec 23, 2019 at 11:13 AM Jarek Potiuk 
> > wrote:
> >
> > > I have a concrete proposal that we can start with. It's not a final set
> > of
> > > markers we might want to have but one that we can start with and make an
> > > immediate use of.
> > >
> > > I would like to adapt our tests to be immediately usable in Breeze (and
> > > tied with it) and follow this approach:
> > >
> > > *Proposed Breeze changes:*
> > >
> > >- `./breeze` by default will start only the main 'airflow-testing'
> > >image. This way no huge resource usage will be needed when breeze is
> > >started by default
> > >- './breeze --all-integrations` will start all dependent images (so we
> > >will be able to run all tests)
> > >- './breeze --integrations [kubernetes,cassandra,mongo,
> > >rabbitmq,redis,openldap,kerberos] - you will be able to choose which
> > >integrations you want to start
> > >- When you run `breeze --backend postgres` it will only start postgres
> > >not mysql and the other way round.
> > >
> > > *Proposed Pytest marks:*
> > >
> > >-
> > >
> > pytest.mark.integrations('kubernetes'),pytest.mark.integrations('cassandra'),.
> > >- pytest,mark.backends("postgres"), pytest,mark.backends("mysql"),
> > >pytest.mark.backends("sqlite")
> > >
> > > It's very easy to add custom switches to pytest and auto-detect what is
> > > the default setting based on environment variables for example. We could
> > > follow
> > >
> > https://docs.pytest.org/en/latest/example/markers.html#custom-marker-and-command-line-option-to-control-test-runs
> > > .
> > >
> > > *Proposed Pytest behaviour:*
> > >
> > >- `pytest` -> in Breeze will run all tests that are applicable within
> > >the current environment:
> > >   - it will only run non-marked tests by default, applicable with
> > >   current selected backend
> > >   - when (for example) you stared cassandra is added it will
> > >   additionally run pytest.mark.integrations('cassandra')
> > >- `pytest` in local environment by default will only run non-marked
> > >tests
> > >- `pytest --integrations [kubernetes, ]` will only run the
> > >integration tests selected (will convert the switch into the
> > corresponding
> > >markers (as explained in the example above)
> > >- `pytest --backends [postgres| mysql | sqlite] will only run the
> > >specific tests that use postgres/mysql/sqlite specific tests
> > >
> > > *What we will achieve by that:*
> > >
> > >- lower resource usage by Breeze by default (while allowing to run
> > >most of the tests)
> > >- easy selection of integration(s) we want to test
> > >- easy way to run all tests to reproduce CI run
> > >- capability of running just 'pytest' and testing (as fast as
> > >possible) all the tests that are applicable in your environment (if
> > you
> > >want to be extra-sure everything works - for example during
> > refactoring)
> > >- in the future we might be able to optimise CI and run smaller set of
> > >tests for postgres/mysql/sqlite 'only' cases - optimising the time
> > for CI
> > >builds.
> > >
> > >
> > > If I will get a general "OK" from community for that - I can make a set
> > of
> > > incremental changes to breeze (as I continue working on prod image) and
> > add
> > > those capabilities to Breeze.
> > >
> > > J.
> > >
> > >
> > >
> > >
> > >
> > >
> > > On Wed, Dec 18, 2019 at 1:10 AM Kamil Breguła  > >
> > > wrote:
> > >
> > >> It is worth adding that we currently use test marking in the project.
> > For
> > >> this purpose, we use the prefix "_system.py" in the file name.
> > >> Unit tests:
> > >>
> > >>
> > https://github.com/apache/airflow/blob/master/tests/operators/test_gcs_to_gcs.py
> > >> System tests:
> > >>
> > >>
> > https://github.com/apache/airflow/blob/master/tests/operators/test_gcs_to_gcs_operator_system.py
> > >> Elsewhere, a special directory structure is used.
> > >> Unit tests:
> > >> 

Re: Grouping tests using pytest markers

2019-12-27 Thread Tomasz Urbaszek
+1 for integrations and backends, it's a good start ;)

T.

On Fri, Dec 27, 2019 at 12:16 PM Jarek Potiuk 
wrote:

> Since I am going to start working on it soon - I'd love to get some
> opinions :).
>
> J.
>
> On Mon, Dec 23, 2019 at 11:13 AM Jarek Potiuk 
> wrote:
>
> > I have a concrete proposal that we can start with. It's not a final set
> of
> > markers we might want to have but one that we can start with and make an
> > immediate use of.
> >
> > I would like to adapt our tests to be immediately usable in Breeze (and
> > tied with it) and follow this approach:
> >
> > *Proposed Breeze changes:*
> >
> >- `./breeze` by default will start only the main 'airflow-testing'
> >image. This way no huge resource usage will be needed when breeze is
> >started by default
> >- './breeze --all-integrations` will start all dependent images (so we
> >will be able to run all tests)
> >- './breeze --integrations [kubernetes,cassandra,mongo,
> >rabbitmq,redis,openldap,kerberos] - you will be able to choose which
> >integrations you want to start
> >- When you run `breeze --backend postgres` it will only start postgres
> >not mysql and the other way round.
> >
> > *Proposed Pytest marks:*
> >
> >-
> >
> pytest.mark.integrations('kubernetes'),pytest.mark.integrations('cassandra'),.
> >- pytest,mark.backends("postgres"), pytest,mark.backends("mysql"),
> >pytest.mark.backends("sqlite")
> >
> > It's very easy to add custom switches to pytest and auto-detect what is
> > the default setting based on environment variables for example. We could
> > follow
> >
> https://docs.pytest.org/en/latest/example/markers.html#custom-marker-and-command-line-option-to-control-test-runs
> > .
> >
> > *Proposed Pytest behaviour:*
> >
> >- `pytest` -> in Breeze will run all tests that are applicable within
> >the current environment:
> >   - it will only run non-marked tests by default, applicable with
> >   current selected backend
> >   - when (for example) you stared cassandra is added it will
> >   additionally run pytest.mark.integrations('cassandra')
> >- `pytest` in local environment by default will only run non-marked
> >tests
> >- `pytest --integrations [kubernetes, ]` will only run the
> >integration tests selected (will convert the switch into the
> corresponding
> >markers (as explained in the example above)
> >- `pytest --backends [postgres| mysql | sqlite] will only run the
> >specific tests that use postgres/mysql/sqlite specific tests
> >
> > *What we will achieve by that:*
> >
> >- lower resource usage by Breeze by default (while allowing to run
> >most of the tests)
> >- easy selection of integration(s) we want to test
> >- easy way to run all tests to reproduce CI run
> >- capability of running just 'pytest' and testing (as fast as
> >possible) all the tests that are applicable in your environment (if
> you
> >want to be extra-sure everything works - for example during
> refactoring)
> >- in the future we might be able to optimise CI and run smaller set of
> >tests for postgres/mysql/sqlite 'only' cases - optimising the time
> for CI
> >builds.
> >
> >
> > If I will get a general "OK" from community for that - I can make a set
> of
> > incremental changes to breeze (as I continue working on prod image) and
> add
> > those capabilities to Breeze.
> >
> > J.
> >
> >
> >
> >
> >
> >
> > On Wed, Dec 18, 2019 at 1:10 AM Kamil Breguła  >
> > wrote:
> >
> >> It is worth adding that we currently use test marking in the project.
> For
> >> this purpose, we use the prefix "_system.py" in the file name.
> >> Unit tests:
> >>
> >>
> https://github.com/apache/airflow/blob/master/tests/operators/test_gcs_to_gcs.py
> >> System tests:
> >>
> >>
> https://github.com/apache/airflow/blob/master/tests/operators/test_gcs_to_gcs_operator_system.py
> >> Elsewhere, a special directory structure is used.
> >> Unit tests:
> >> https://github.com/apache/airflow/tree/master/tests/kubernetes
> >> Integration tests:
> >>
> https://github.com/apache/airflow/tree/master/tests/integration/kubernetes
> >>
> >> This will allow us to limit e.g. mocking in system tests.
> >> This seems to be a clearer solution because it clearly separates each
> type
> >> of test. If we add markers, they may not be noticed when making changes
> >> and
> >> review. The file name is immediately visible.
> >> Recently I dealt with such a case that system tests included mocking,
> >> which
> >> by definition did not work.
> >>
> >>
> https://github.com/apache/airflow/commit/11262c6d42c4612890a6eec71783e0a6d5b22c17
> >>
> >>
> >> On Tue, Dec 10, 2019 at 2:22 PM Jarek Potiuk 
> >> wrote:
> >>
> >> > I am all-in for markers.
> >> >
> >> > I think we should start with small set of useful markers, which should
> >> have
> >> > a useful purpose from the beginning and implement them first - to
> learn
> >> how
> >> > useful they are 

Re: Grouping tests using pytest markers

2019-12-27 Thread Jarek Potiuk
Since I am going to start working on it soon - I'd love to get some
opinions :).

J.

On Mon, Dec 23, 2019 at 11:13 AM Jarek Potiuk 
wrote:

> I have a concrete proposal that we can start with. It's not a final set of
> markers we might want to have but one that we can start with and make an
> immediate use of.
>
> I would like to adapt our tests to be immediately usable in Breeze (and
> tied with it) and follow this approach:
>
> *Proposed Breeze changes:*
>
>- `./breeze` by default will start only the main 'airflow-testing'
>image. This way no huge resource usage will be needed when breeze is
>started by default
>- './breeze --all-integrations` will start all dependent images (so we
>will be able to run all tests)
>- './breeze --integrations [kubernetes,cassandra,mongo,
>rabbitmq,redis,openldap,kerberos] - you will be able to choose which
>integrations you want to start
>- When you run `breeze --backend postgres` it will only start postgres
>not mysql and the other way round.
>
> *Proposed Pytest marks:*
>
>-
>
> pytest.mark.integrations('kubernetes'),pytest.mark.integrations('cassandra'),.
>- pytest,mark.backends("postgres"), pytest,mark.backends("mysql"),
>pytest.mark.backends("sqlite")
>
> It's very easy to add custom switches to pytest and auto-detect what is
> the default setting based on environment variables for example. We could
> follow
> https://docs.pytest.org/en/latest/example/markers.html#custom-marker-and-command-line-option-to-control-test-runs
> .
>
> *Proposed Pytest behaviour:*
>
>- `pytest` -> in Breeze will run all tests that are applicable within
>the current environment:
>   - it will only run non-marked tests by default, applicable with
>   current selected backend
>   - when (for example) you stared cassandra is added it will
>   additionally run pytest.mark.integrations('cassandra')
>- `pytest` in local environment by default will only run non-marked
>tests
>- `pytest --integrations [kubernetes, ]` will only run the
>integration tests selected (will convert the switch into the corresponding
>markers (as explained in the example above)
>- `pytest --backends [postgres| mysql | sqlite] will only run the
>specific tests that use postgres/mysql/sqlite specific tests
>
> *What we will achieve by that:*
>
>- lower resource usage by Breeze by default (while allowing to run
>most of the tests)
>- easy selection of integration(s) we want to test
>- easy way to run all tests to reproduce CI run
>- capability of running just 'pytest' and testing (as fast as
>possible) all the tests that are applicable in your environment (if you
>want to be extra-sure everything works - for example during refactoring)
>- in the future we might be able to optimise CI and run smaller set of
>tests for postgres/mysql/sqlite 'only' cases - optimising the time for CI
>builds.
>
>
> If I will get a general "OK" from community for that - I can make a set of
> incremental changes to breeze (as I continue working on prod image) and add
> those capabilities to Breeze.
>
> J.
>
>
>
>
>
>
> On Wed, Dec 18, 2019 at 1:10 AM Kamil Breguła 
> wrote:
>
>> It is worth adding that we currently use test marking in the project. For
>> this purpose, we use the prefix "_system.py" in the file name.
>> Unit tests:
>>
>> https://github.com/apache/airflow/blob/master/tests/operators/test_gcs_to_gcs.py
>> System tests:
>>
>> https://github.com/apache/airflow/blob/master/tests/operators/test_gcs_to_gcs_operator_system.py
>> Elsewhere, a special directory structure is used.
>> Unit tests:
>> https://github.com/apache/airflow/tree/master/tests/kubernetes
>> Integration tests:
>> https://github.com/apache/airflow/tree/master/tests/integration/kubernetes
>>
>> This will allow us to limit e.g. mocking in system tests.
>> This seems to be a clearer solution because it clearly separates each type
>> of test. If we add markers, they may not be noticed when making changes
>> and
>> review. The file name is immediately visible.
>> Recently I dealt with such a case that system tests included mocking,
>> which
>> by definition did not work.
>>
>> https://github.com/apache/airflow/commit/11262c6d42c4612890a6eec71783e0a6d5b22c17
>>
>>
>> On Tue, Dec 10, 2019 at 2:22 PM Jarek Potiuk 
>> wrote:
>>
>> > I am all-in for markers.
>> >
>> > I think we should start with small set of useful markers, which should
>> have
>> > a useful purpose from the beginning and implement them first - to learn
>> how
>> > useful they are (before we decide on full set of markers).
>> > Otherwise maintaining those markers will become a fruitless "chore" and
>> it
>> > might be abandoned.
>> >
>> > So my proposal is to agree the first top cases we want to handle with
>> > markers and then define/apply the markers accordingly:
>> >
>> > Those are my three top priorities (from most important to least):
>> >
>> >- 

Re: Grouping tests using pytest markers

2019-12-23 Thread Jarek Potiuk
I have a concrete proposal that we can start with. It's not the final set
of markers we might want to have, but one that we can start with and make
immediate use of.

I would like to adapt our tests to be immediately usable in Breeze (and
tied with it) and follow this approach:

*Proposed Breeze changes:*

   - `./breeze` by default will start only the main 'airflow-testing'
   image. This way no huge resource usage will be needed when breeze is
   started by default
   - `./breeze --all-integrations` will start all dependent images (so we
   will be able to run all tests)
   - `./breeze --integrations [kubernetes,cassandra,mongo,
   rabbitmq,redis,openldap,kerberos]` - you will be able to choose which
   integrations you want to start
   - When you run `breeze --backend postgres` it will only start postgres,
   not mysql, and the other way round.

*Proposed Pytest marks:*

   - pytest.mark.integrations('kubernetes'),
   pytest.mark.integrations('cassandra'), ...
   - pytest.mark.backends("postgres"), pytest.mark.backends("mysql"),
   pytest.mark.backends("sqlite")

It's very easy to add custom switches to pytest and to auto-detect the
default setting based on, for example, environment variables. We could
follow
https://docs.pytest.org/en/latest/example/markers.html#custom-marker-and-command-line-option-to-control-test-runs
- a rough sketch of such a switch is below.
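
A minimal sketch of such a switch, in the spirit of that pytest docs
example (the option name, the BACKEND variable and the default are my
assumptions for illustration):

    # conftest.py (sketch only)
    import os
    import pytest

    def pytest_addoption(parser):
        parser.addoption("--backend", action="store",
                         default=os.environ.get("BACKEND", "sqlite"),
                         help="backend the test environment is running against")

    def pytest_collection_modifyitems(config, items):
        backend = config.getoption("--backend")
        for item in items:
            for marker in item.iter_markers(name="backends"):
                if backend not in marker.args:
                    item.add_marker(pytest.mark.skip(
                        reason=f"test requires one of the backends: {marker.args}"))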

*Proposed Pytest behaviour:*

   - `pytest` -> in Breeze will run all tests that are applicable within
   the current environment:
  - it will only run non-marked tests by default, applicable with the
  currently selected backend
  - when (for example) you have started cassandra, it will
  additionally run the pytest.mark.integrations('cassandra') tests
   - `pytest` in a local environment will by default only run non-marked tests
   - `pytest --integrations [kubernetes, ...]` will only run the
   selected integration tests (the switch will be converted into the
   corresponding markers, as explained in the example above)
   - `pytest --backends [postgres|mysql|sqlite]` will only run the
   tests specific to postgres/mysql/sqlite

*What we will achieve by that:*

   - lower resource usage by Breeze by default (while allowing to run most
   of the tests)
   - easy selection of integration(s) we want to test
   - easy way to run all tests to reproduce CI run
   - capability of running just 'pytest' and testing (as fast as possible)
   all the tests that are applicable in your environment (if you want to be
   extra-sure everything works - for example during refactoring)
   - in the future we might be able to optimise CI and run a smaller set of
   tests for the postgres/mysql/sqlite 'only' cases - optimising the time
   for CI builds.


If I get a general "OK" from the community for this, I can make a set of
incremental changes to Breeze (as I continue working on the prod image) and
add those capabilities.

J.






On Wed, Dec 18, 2019 at 1:10 AM Kamil Breguła 
wrote:

> It is worth adding that we currently use test marking in the project. For
> this purpose, we use the prefix "_system.py" in the file name.
> Unit tests:
>
> https://github.com/apache/airflow/blob/master/tests/operators/test_gcs_to_gcs.py
> System tests:
>
> https://github.com/apache/airflow/blob/master/tests/operators/test_gcs_to_gcs_operator_system.py
> Elsewhere, a special directory structure is used.
> Unit tests: https://github.com/apache/airflow/tree/master/tests/kubernetes
> Integration tests:
> https://github.com/apache/airflow/tree/master/tests/integration/kubernetes
>
> This will allow us to limit e.g. mocking in system tests.
> This seems to be a clearer solution because it clearly separates each type
> of test. If we add markers, they may not be noticed when making changes and
> review. The file name is immediately visible.
> Recently I dealt with such a case that system tests included mocking, which
> by definition did not work.
>
> https://github.com/apache/airflow/commit/11262c6d42c4612890a6eec71783e0a6d5b22c17
>
>
> On Tue, Dec 10, 2019 at 2:22 PM Jarek Potiuk 
> wrote:
>
> > I am all-in for markers.
> >
> > I think we should start with small set of useful markers, which should
> have
> > a useful purpose from the beginning and implement them first - to learn
> how
> > useful they are (before we decide on full set of markers).
> > Otherwise maintaining those markers will become a fruitless "chore" and
> it
> > might be abandoned.
> >
> > So my proposal is to agree the first top cases we want to handle with
> > markers and then define/apply the markers accordingly:
> >
> > Those are my three top priorities (from most important to least):
> >
> >- Splitting out the Integration tests (and updating Breeze) so that
> you
> >choose which integration you start when you start Breeze rather than
> > start
> >them all.
> >- DB separation so that we do not repeat non-DB tests on all
> Databases.
> >- Proper separation of Kubernetes tests (They are now filtered out
> based

Re: Grouping tests using pytest markers

2019-12-17 Thread Kamil Breguła
It is worth adding that we currently use test marking in the project. For
this purpose, we use the suffix "_system.py" in the file name.
Unit tests:
https://github.com/apache/airflow/blob/master/tests/operators/test_gcs_to_gcs.py
System tests:
https://github.com/apache/airflow/blob/master/tests/operators/test_gcs_to_gcs_operator_system.py
Elsewhere, a special directory structure is used.
Unit tests: https://github.com/apache/airflow/tree/master/tests/kubernetes
Integration tests:
https://github.com/apache/airflow/tree/master/tests/integration/kubernetes

This allows us to limit e.g. mocking in system tests.
It seems to be a clearer solution because it clearly separates each type
of test. If we add markers, they may not be noticed during changes and
review, whereas the file name is immediately visible.
Recently I dealt with a case where system tests included mocking,
which by definition did not work:
https://github.com/apache/airflow/commit/11262c6d42c4612890a6eec71783e0a6d5b22c17


On Tue, Dec 10, 2019 at 2:22 PM Jarek Potiuk 
wrote:

> I am all-in for markers.
>
> I think we should start with small set of useful markers, which should have
> a useful purpose from the beginning and implement them first - to learn how
> useful they are (before we decide on full set of markers).
> Otherwise maintaining those markers will become a fruitless "chore" and it
> might be abandoned.
>
> So my proposal is to agree the first top cases we want to handle with
> markers and then define/apply the markers accordingly:
>
> Those are my three top priorities (from most important to least):
>
>- Splitting out the Integration tests (and updating Breeze) so that you
>choose which integration you start when you start Breeze rather than
> start
>them all.
>- DB separation so that we do not repeat non-DB tests on all Databases.
>- Proper separation of Kubernetes tests (They are now filtered out based
>on skipif/env variables.
>
>
> J.
>
>
> On Tue, Dec 10, 2019 at 1:32 PM Tomasz Urbaszek <
> tomasz.urbas...@polidea.com>
> wrote:
>
> > Hi everyone,
> >
> > Since we run our tests using pytest we are able to use test markers [1].
> > Using them will give
> > use some useful things:
> > - additional information of test type (ex. when used for system test)
> > - easy way to select test by types (ex. pytest -v -m "not system")
> > - way to split our test suite in more effective way (no need to run all
> > tests on 3 backends)
> >
> > I would like to discuss what "official" marks would we like to use. As a
> > base I would suggests
> > to mark tests as:
> > - system - tests that need the outside world to be successful (ex. GCP
> > system tests)
> > - db[postgres, sqlite, mysql] - tests that require database to be
> > successful, in other words,
> > tests that create some db side effects
> > - integration - tests that requires some additional resources like
> > Cassandra or Kubernetes
> >
> > All other, unmarked tests would be treated as "pure" meaning that they
> have
> > no side effects
> > (at least on database level).
> >
> > What do you think about this? Does anyone have some experience with using
> > markers in
> > such a big project?
> >
> > [1] http://doc.pytest.org/en/latest/example/markers.html
> >
> >
> > Bests,
> > Tomek Urbaszek
> >
>
>
> --
>
> Jarek Potiuk
> Polidea  | Principal Software Engineer
>
> M: +48 660 796 129
>


Re: Grouping tests using pytest markers

2019-12-10 Thread Jarek Potiuk
I am all-in for markers.

I think we should start with a small set of useful markers that have a
clear purpose from the beginning, and implement them first - to learn how
useful they are (before we decide on the full set of markers).
Otherwise maintaining those markers will become a fruitless "chore" and
might be abandoned.

So my proposal is to agree on the top cases we want to handle with markers
first, and then define/apply the markers accordingly:

Those are my three top priorities (from most important to least):

   - Splitting out the Integration tests (and updating Breeze) so that you
   choose which integrations you start when you start Breeze, rather than
   starting them all.
   - DB separation, so that we do not repeat non-DB tests on all databases.
   - Proper separation of Kubernetes tests (they are now filtered out based
   on skipif/env variables) - see the sketch below.
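
For context, the difference between the current style of filtering and the
marker style could look roughly like this (both are sketches - the
RUN_KUBERNETES_TESTS variable and the class names are made up for
illustration):

    import os
    import unittest
    import pytest

    # today: each test guards itself with skipIf on an environment variable
    @unittest.skipIf(os.environ.get("RUN_KUBERNETES_TESTS") != "true",
                     "Kubernetes environment not available")
    class TestKubernetesExecutorToday(unittest.TestCase):
        ...

    # proposed: a declarative marker; skipping is decided centrally in conftest.py
    @pytest.mark.integrations("kubernetes")
    class TestKubernetesExecutorWithMarker:
        ...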


J.


On Tue, Dec 10, 2019 at 1:32 PM Tomasz Urbaszek 
wrote:

> Hi everyone,
>
> Since we run our tests using pytest we are able to use test markers [1].
> Using them will give
> use some useful things:
> - additional information of test type (ex. when used for system test)
> - easy way to select test by types (ex. pytest -v -m "not system")
> - way to split our test suite in more effective way (no need to run all
> tests on 3 backends)
>
> I would like to discuss what "official" marks would we like to use. As a
> base I would suggests
> to mark tests as:
> - system - tests that need the outside world to be successful (ex. GCP
> system tests)
> - db[postgres, sqlite, mysql] - tests that require database to be
> successful, in other words,
> tests that create some db side effects
> - integration - tests that requires some additional resources like
> Cassandra or Kubernetes
>
> All other, unmarked tests would be treated as "pure" meaning that they have
> no side effects
> (at least on database level).
>
> What do you think about this? Does anyone have some experience with using
> markers in
> such a big project?
>
> [1] http://doc.pytest.org/en/latest/example/markers.html
>
>
> Bests,
> Tomek Urbaszek
>


-- 

Jarek Potiuk
Polidea  | Principal Software Engineer

M: +48 660 796 129