Re: D19 Comments and Diff

2014-03-05 Thread Josef Skladanka

- Original Message -
> From: "Tim Flink" 
> To: qa-devel@lists.fedoraproject.org
> Sent: Wednesday, March 5, 2014 8:23:31 PM
> Subject: Re: D19 Comments and Diff
> 
> I'm generally of the mind that folks shouldn't have to dive into
> docstrings on tests in order to understand what is being tested. It is
> unavoidable in some cases where you have complex tests but that's
> usually a test or code smell which should at least be acknowledged.

Sure, I do agree. On the other hand, I believe that the developer running the 
tests should have at least an overall idea of what the "tested" code does 
(since he probably made some changes, and that triggered the need to run the 
unit tests). I do not know why, but sometimes it seems like people (and I'm 
referring to my previous employments here) tend to believe (and I share this 
belief to some extent) that the "production" code can be complex (and by that I 
do not mean smelly) and that the people reading/maintaining it will be able to 
understand it (with the possible help of comments), but that at the same time 
the (unit) tests must be written in such a way that a first-year high school 
student can instantly understand them. Maybe it is a residue of the usual 
corporate "testing is done by dummies, programming is done by the geniuses" 
approach, I don't know. But I kind of tend to treat the tests as a helper tool 
for _the developer_.

> One of my counter-points here is that if the tests are so trivial, do
> we really need to have them? After a certain point, we can add tests
> but if they aren't incredibly meaningful, we'd just be adding
> maintenance burden and not gaining much from the additional coverage.

Sure, but the way I write tests is bottom-up - I start from the simple ones 
and work my way up to the "more complex" tests. I'm not saying this is the 
good/best approach, it just makes sense for me to know that the basics are 
covered before diving into the more "tricky" stuff.
 
> Overreact much? :-P

Yup, I sometimes tend to :D But once you see my head-desks you'll understand :D
 
> I may have gone a little too far and not spent enough time on the
> tests. I agree that some of the test names could be better and that
> there's not a whole lot of benefit to being rigid about "one assert per
> test" regardless of whether or not it's generally accepted as a good
> thing.

I know that "One assert to rule them all" (hyperbole here) is usually 
considered _the_ approach, but every time I see the "Have one assert per 
test" guideline (which tends to be interpreted as _the rule_), there is this 
other guideline saying "Test one logical concept per test". And this is what I 
tend to do, and what (IMHO) Kamil did in his de-coupled patch. So not all tests 
with more than one assert are necessarily a test smell.
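
To illustrate the difference, here is a hypothetical sketch (the class and
values are invented stand-ins, not code from D19 or libtaskotron) of "one
logical concept per test" next to the strict one-assert-per-method style:

# Hypothetical sketch; the class and names below are invented stand-ins.

class CheckDetail(object):
    """Toy stand-in for the real CheckDetail being discussed."""
    def __init__(self, item, outcome="NEEDS_INSPECTION"):
        self.item = item
        self.outcome = outcome
        self.keyvals = {}

# "Test one logical concept per test": construction defaults, several asserts.
def test_init_defaults():
    cd = CheckDetail("foo-1.0-1.fc20")
    assert cd.item == "foo-1.0-1.fc20"
    assert cd.outcome == "NEEDS_INSPECTION"
    assert cd.keyvals == {}

# Strict "one assert per test" splits the same concept into several methods
# whose names end up carrying all the information:
def test_init_item_set_via_param():
    assert CheckDetail("foo-1.0-1.fc20").item == "foo-1.0-1.fc20"

def test_init_outcome_defaults_to_needs_inspection():
    assert CheckDetail("foo-1.0-1.fc20").outcome == "NEEDS_INSPECTION"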

And finding the right balance between the two guidelines is IMHO the goal we 
should aim for. So yes, having method names longer than the actual test code is 
something I consider... not really that great :) But I understand that you 
wanted to show Kamil (and the rest of us) what can be done, and what the 
general guidelines behind unit testing are, so I'm not trying to disregard 
the overall benefit.

> I also want to avoid increasing the maintenance burden of having a
> bunch of tests that look for things which don't really need to be
> tested (setting data members, checking default values etc.)

I agree, there is stuff that can kind of be taken for granted.

J.


Possible QA Devel Projects for GSoC 2014

2014-03-05 Thread Tim Flink
Fedora has been accepted as a mentoring org for GSoC 2014 and I'm
planning to sign up as a mentor again this year. I'm trying to think of
good projects that we could put on the list of suggestions of things
that we'd like students to work on and figured it was a good topic for
wider discussion.

I'd like to avoid any blockerbugs projects this year so that we can
focus on taskotron and keeping forward momentum. I've made a quick list
of the possible projects that I can think of. Please comment on any of
them that you think would benefit from having a dedicated intern over
the summer or add to the list if you can think of other projects.

Ideally, the projects would be self-contained enough for the student to
demonstrate their progress over the summer but not so isolated that
they wouldn't be interacting with the community. Projects should be far
enough out that we wouldn't be critically blocked on them but close
enough that the effects of their work are visible before the end of
GSoC.

Tim



Graphical Installation Testing

Continue the work that Jan started with his thesis or look into
integrating something like openqa. The emphasis here is on the
graphical interface since ks-based installation testing could be
covered by stuff already written for beaker.



Beaker Integration

This is on our roadmap and is certainly something that would be useful.
It would require a bit of infrastructure work and likely the
cooperation of the beaker devs but seems like it could be a good
project even if it isn't the most exciting thing ever.

On the other hand, this could end up being rather critical and may not
be something that we want to mostly hand off to a student.



Gnome Integration Test Support


An over-simplification of this would be to say "take the stuff that's
run as part of gnome continuous [1] and run it on fedora packages". The
goal would be to have gnome's integration test suites running with any
new gnome builds.

[1] https://wiki.gnome.org/action/show/Projects/GnomeContinuous



Disposable Client Support


This is another of the big features that we'll be implementing before
too long. It's one of the reasons that we made the shift from AutoQA to
taskotron and is blocking features which folks say they want to see
(user-submitted tasks, mostly).

This would involve some investigation into whether OpenStack would be
practical, if there is another provisioning system we could use or if
we'll be forced to roll our own (which I'd rather avoid). There should
be some tie-in with the graphical installation support and possibly the
gnome integration tests.



RPM-OSTree Support/Integration


I haven't used rpm-ostree [2] enough yet to figure out how good of a fit
it'd be with taskotron but from the description of the project and the
discussions I've had with cwalters, it sounds like it could be a good
fit as part of our provisioning system for disposable clients.

If we're serious about proposing this as a GSoC project, we should
probably explore it a bit more to be certain that we'd want it now but
I figured it was worth putting on the list.

[2] https://github.com/cgwalters/rpm-ostree



System for apparent results storage and modification


There has to be a better title for this but it would be one of the last
major steps in enabling bodhi/koji to block builds/updates on check
failures. The idea would be to provide an interface which can decide
whether a build/update is OK based on which checks passed/failed.
It would have a mechanism for manual overrides and algorithmic
overrides (i.e., we know that foo has problem X and are working on it, so
ignore its failures for now) so that we don't upset packagers more than we
need to.
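
To make that a bit more concrete, here is a purely hypothetical sketch of
the kind of decision such an interface might expose - every name below is
invented, it is not an actual design:

# Purely hypothetical sketch of the gating idea described above;
# none of these names come from an actual taskotron/bodhi design.

# "algorithmic override": foo is known to fail depcheck, ignore it for now
WAIVERS = {("foo", "depcheck")}

def update_is_ok(update, results):
    """results maps check name -> outcome, e.g. {"depcheck": "FAILED"}."""
    for check, outcome in results.items():
        if outcome == "PASSED":
            continue
        if (update, check) in WAIVERS:
            continue  # waived failure, don't block the update
        return False
    return True

print(update_is_ok("foo", {"depcheck": "FAILED", "rpmlint": "PASSED"}))  # True
print(update_is_ok("bar", {"depcheck": "FAILED"}))                       # False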

When Josef and I last talked about this, we weren't sure that putting
this functionality into our results storage mechanism was wise. It's a
different concern that has the potential to make a mess out of the
results storage.




Default invocation of pytest for qadevel projects

2014-03-05 Thread Tim Flink
For the qadevel projects I'm aware of, we've been using pytest for our
unit/functional test runner.

As part of the shared configuration, tests are split into two
categories, unit and functional. Unit tests are fast and do not touch the
network, database, or filesystem (there are some exceptions to that last
part, though). Functional tests tend to be more like integration tests that
can set up a database or do other things which fall outside the
previous definition of "unit test".

When you run pytest without any extra args, only the unit tests are
run. The '--functional' arg is needed to enable collection and
execution of the functional tests.
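
For reference, here is a minimal conftest.py sketch of how that kind of
split is commonly wired up in pytest. The actual shared configuration in
our repos may differ, and the "functional" marker name below is just an
assumption:

# conftest.py -- minimal sketch of the current default behavior;
# the real shared config may differ.

def pytest_addoption(parser):
    parser.addoption("--functional", action="store_true", default=False,
                     help="also collect and run functional tests")

def pytest_collection_modifyitems(config, items):
    if config.getoption("--functional"):
        return  # collect everything
    # by default, drop tests marked with @pytest.mark.functional
    items[:] = [item for item in items if "functional" not in item.keywords]

Option 1 below would essentially flip that default and do the filtering
behind a '--unit' arg instead.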

Kamil recently made a request [1] to do one of two things:

[1] https://phab.qadevel.cloud.fedoraproject.org/T89

1. Change the default such that functional tests are collected and
exclude them from collection using a '--unit' arg

2. Change the functional arg from '--functional' to something shorter,
like '--func'

  * note that there are restrictions on which args we can use. For
example, '-f' is not allowed as single letter args are reserved for
pytest itself

As stated in T89, I don't have a strong opinion on this as long as it's
possible to exclude the functional tests from collection and we make
the same change across all of our active projects. However, I wanted to
put this up for a wider discussion before changing things.

Any other thoughts on this proposed change?

Tim




Re: D19 Comments and Diff

2014-03-05 Thread Tim Flink
On Wed, 5 Mar 2014 12:23:31 -0700
Tim Flink  wrote:


 
> > That's true, but if we have zero-failed-test policy, why does it
> matter? If your patch introduced a problem, you need to fix it, and
> > you might discover another one once you do. Yes, it might be a
> > sequence of steps rather than seeing everything at once, but at the
> > benefit of test readability (items logically grouped together,
> > rather than one assertion per method). Also, I'm testing a part of
> > a single function in one method, it's not like it can have large
> > consequences. I understand why splitting larger pieces to smaller
> > pieces is good, but I don't understand why splitting it into one
> > line per method is good. All of those tests are trivial - simple
> > functionality of simple methods.
> 
> One of my counter-points here is that if the tests are so trivial, do
> we really need to have them? After a certain point, we can add tests
> but if they aren't incredibly meaningful, we'd just be adding
> maintenance burden and not gaining much from the additional coverage.
> 
> > They have a dozen lines, a few asserts. An error in the
> > first line can hide an error or two in the remaining eleven lines,
> > but you'll discover that soon enough, once you fix the code and run
> > the tests again. Is that a problem?
> 
> In this particular case, not a huge problem - the code under test is
> relatively trivial. That being said, I don't really want to start off
> with tests that

Complete thoughts do tend to help ...

That being said, I don't really want to start off with tests that put
too many asserts into a single test. Some combinations are fine, but I'd
prefer erring on the side of separating things out a bit too much for now
rather than risk setting an example of large, monolithic tests.

Tim




Re: D19 Comments and Diff

2014-03-05 Thread Tim Flink
For some background, I sent this email to Kamil yesterday with a diff
containing suggestions on splitting up the unit tests for CheckDetail
and python_utils. It didn't seem like a big deal to me at the time, so
I just sent it to him. Given the level and type of discussion that it
seems to have started, I agree with Kamil that this needs to be discussed
with a wider audience, so I'm replying to the list.

The diff has been slightly tweaked with a comment and a test method
name change. I've left some commented out tests in the diff to show
where I think tests could be removed without getting rid of meaningful
test coverage.

I wasn't very clear on this in my first email, but I meant the diff
partially as an example and it still needs a bit of work. I'm planning
to update the diff and add more comments to the review.

The review that spawned this conversation is up at:
https://phab.qadevel.cloud.fedoraproject.org/D19

My comments are inline.

Tim

On Wed, 5 Mar 2014 07:21:09 -0500 (EST)
Kamil Paral  wrote:

> > Kamil,
> > 
> > Instead of putting comments directly into D19, I figured it would be
> > easier to put them in the form of a diff.
> 
> It might be better to discuss this publicly, so that others can also
> voice their opinions. You can attach files to Phab using Files, and
> then you can somehow easily reference them inside the comment, I
> guess. That might be a good way to upload a related diff.

As stated above, agreed.

> > I realize this is going back on what I said last week, but I think
> > that separating out the tests even farther would be for the best
> > for the following reasons:
> > 
> > 1. I still had a bit of trouble understanding what exactly all the
> > tests were supposed to be doing. I think that separating them out
> > and making the names more verbose will aid in maintainability when
> > other people end up reading the tests
> 
> Can you specify which lines exactly were unclear to you? What if I
> provided docstrings for such methods? Because I think that using a
> docstring for explanation is much better way than trying to squeeze
> it into a method name (which are then very confusing and complex,
> like test_init_details_set_via_params_output() or
> test_update_result_no_override_higher_priority_fail_pass() - that's
> not readable at all; actually it looks like Java!:-) ). I think
> people would spend more time decrypting the method names than reading
> the docstring.
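
(For illustration only, two hypothetical tests contrasting the styles being
discussed - the helper and values here are invented, not real libtaskotron
code:)

# Hypothetical illustration; the helper below is an invented stand-in.

OUTCOME_PRIORITY = ["PASSED", "INFO", "NEEDS_INSPECTION", "FAILED"]

def more_severe(a, b):
    """Return the more severe of two outcomes."""
    return max(a, b, key=OUTCOME_PRIORITY.index)

# Style 1: all the information is squeezed into the method name.
def test_update_result_no_override_higher_priority_fail_pass():
    assert more_severe("FAILED", "PASSED") == "FAILED"

# Style 2: a short name plus a docstring.
def test_outcome_priority():
    """A FAILED outcome must not be overridden by a later PASSED outcome."""
    assert more_severe("FAILED", "PASSED") == "FAILED"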

Yeah, some of the method names are a bit long and obtuse. I could have
chosen better names. I fixed the worst one in the new diff, though :)

I'm generally of the mind that folks shouldn't have to dive into
docstrings on tests in order to understand what is being tested. It is
unavoidable in some cases where you have complex tests but that's
usually a test or code smell which should at least be acknowledged.

> > 
> > 2. By splitting up the tests, almost all the assertions are always
> > run. When they're grouped together, the test fails @ the first
> > failed assertion and not all assertions are run.
> 
> That's true, but if we have zero-failed-test policy, why does it
> matter? If your patch introduced a problem, you need to fix it, and
> you might discover another one once you do. Yes, it might be a
> sequence of steps rather than seeing everything at once, but at the
> benefit of test readability (items logically grouped together, rather
> than one assertion per method). Also, I'm testing a part of a single
> function in one method, it's not like it can have large consequences.
> I understand why splitting larger pieces to smaller pieces is good,
> but I don't understand why splitting it into one line per method is
> good. All of those tests are trivial - simple functionality of simple
> methods.

One of my counter-points here is that if the tests are so trivial, do
we really need to have them? After a certain point, we can add tests
but if they aren't incredibly meaningful, we'd just be adding
maintenance burden and not gaining much from the additional coverage.

> They have a dozen lines, a few asserts. An error in the
> first line can hide an error or two in the remaining eleven lines,
> but you'll discover that soon enough, once you fix the code and run
> the tests again. Is that a problem?

In this particular case, not a huge problem - the code under test is
relatively trivial. That being said, I don't really want to start off
with tests that

> > I've gone through and made comments in the diff in addition to
> > splitting up most of the tests.
> > 
> > As I'm writing this, I had a couple of problems with the fixtures
> > when using my system py.test which go away with py.test 2.5.2 in a
> > virtualenv. This may be some issue with my local env but I wanted to
> > send off the diff instead of sitting on it another day. I can take
> > another look at the tests if they're causing broader problems.
> 
> The fixtures are interesting, thanks. I don't really understand why
> it needs 4 methods instead of just 2, bu