Re: D19 Comments and Diff
----- Original Message -----
> From: "Tim Flink"
> To: qa-devel@lists.fedoraproject.org
> Sent: Wednesday, March 5, 2014 8:23:31 PM
> Subject: Re: D19 Comments and Diff
>
> I'm generally of the mind that folks shouldn't have to dive into
> docstrings on tests in order to understand what is being tested. It is
> unavoidable in some cases where you have complex tests but that's
> usually a test or code smell which should at least be acknowledged.

Sure, I agree. On the other hand, I believe that the developer running the tests should have at least an overall idea of what the "tested" code does (since he probably made some changes, and that triggered the need to run the unit tests).

I don't know why, but sometimes it seems like people (and I'm referring to my previous employments here) tend to believe (and I share this belief to some extent) that the "production" code can be complex (and by that I do not mean smelly), and that the people reading/maintaining it will be able to understand it (with the possible help of comments). But at the same time, the (unit) tests must be written in such a way that a first-year high school student can instantly understand them. Maybe it is a residue of the usual corporate "testing is done by dummies, programming is done by the geniuses" approach, I don't know. But I tend to treat the tests as a helper tool for _the developer_.

> One of my counter-points here is that if the tests are so trivial, do
> we really need to have them? After a certain point, we can add tests
> but if they aren't incredibly meaningful, we'd just be adding
> maintenance burden and not gaining much from the additional coverage.

Sure, but the way I write tests is bottom-up - I start from the simple ones and work my way up to the "more complex" tests. I'm not saying this is the good/best approach; it just makes sense for me to know that the basics are covered before diving into the more "tricky" stuff.

> Overreact much? :-P

Yup, I sometimes tend to :D But once you see my head-desks you'll understand :D

> I may have gone a little too far and not spent enough time on the
> tests. I agree that some of the test names could be better and that
> there's not a whole lot of benefit to being rigid about "one assert per
> test" regardless of whether or not it's generally accepted as a good
> thing.

I know that "One assert to rule them all" (hyperbole here) is usually considered _the_ approach, but every time I see the "have one assert per test" guideline (which tends to be interpreted as _the rule_), there is this other guideline saying "test one logical concept per test". And this is what I tend to do, and what (IMHO) Kamil did in his de-coupled patch. So not all tests with more than one assert are necessarily a test smell, and finding the right balance between the two guidelines is IMHO the goal we should aim for.

So yes, having method names longer than the actual test code is something I consider... not really that great :) But I understand that you wanted to show Kamil (and the rest of us) what can be done and what the general guidelines behind unit testing are, so I'm not trying to disregard the overall benefit.

> I also want to avoid increasing the maintenance burden of having a
> bunch of tests that look for things which don't really need to be
> tested (setting data members, checking default values etc.)

I agree, there is stuff that kind of can be taken for granted.

J.
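To make the two guidelines concrete, here is a small hypothetical sketch, not taken from the D19 diff; the CheckDetail attributes and values below are invented for illustration only. It shows "one logical concept per test" next to the strict one-assert-per-test style:

    # Hypothetical sketch only -- the CheckDetail class and its attributes
    # are invented for illustration and are not copied from the D19 diff.

    class CheckDetail(object):
        def __init__(self, item, outcome="NEEDS_INSPECTION", note=""):
            self.item = item
            self.outcome = outcome
            self.note = note

    # "one logical concept per test": the whole constructor contract is one
    # concept, so several related asserts live in a single test method
    def test_init_defaults():
        cd = CheckDetail("foo-1.0-1.fc20")
        assert cd.item == "foo-1.0-1.fc20"
        assert cd.outcome == "NEEDS_INSPECTION"
        assert cd.note == ""

    # strict "one assert per test": the same contract split across methods;
    # every assert always runs, at the cost of more boilerplate and names
    def test_init_default_outcome():
        assert CheckDetail("foo-1.0-1.fc20").outcome == "NEEDS_INSPECTION"

    def test_init_default_note():
        assert CheckDetail("foo-1.0-1.fc20").note == ""

Either style runs fine under pytest; the difference is only in how failures are reported and how much naming overhead the split version carries.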
Possible QA Devel Projects for GSoC 2014
Fedora has been accepted as a mentoring org for GSoC 2014 and I'm planning to sign up as a mentor again this year. I'm trying to think of good projects that we could put on the list of suggestions of things that we'd like students to work on and figured it was a good topic for wider discussion. I'd like to avoid any blockerbugs projects this year so that we can focus on taskotron and keeping forward momentum.

I've made a quick list of the possible projects that I can think of. Please comment on any of them that you think would benefit from having a dedicated intern over the summer, or add to the list if you can think of other projects. Ideally, the projects would be self-contained enough for the student to demonstrate their progress over the summer but not so isolated that they wouldn't be interacting with the community. Projects should be far enough out that we wouldn't be critically blocked on them but close enough that the effects of their work are visible before the end of GSoC.

Tim

Graphical Installation Testing

Continue the work that Jan started with his thesis or look into integrating something like openqa. The emphasis here is on the graphical interface, since ks-based installation testing could be covered by stuff already written for beaker.

Beaker Integration

This is on our roadmap and is certainly something that would be useful. It would require a bit of infrastructure work and likely the cooperation of the beaker devs, but seems like it could be a good project even if it isn't the most exciting thing ever. On the other hand, this could end up being rather critical and may not be something that we want to mostly hand off to a student.

Gnome Integration Test Support

An over-simplification of this would be to say "take the stuff that's run as part of gnome continuous [1] and run it on fedora packages". The goal would be to have gnome's integration test suites running with any new gnome builds.

[1] https://wiki.gnome.org/action/show/Projects/GnomeContinuous

Disposable Client Support

This is another of the big features that we'll be implementing before too long. It's one of the reasons that we made the shift from AutoQA to taskotron and is blocking features which folks say they want to see (user-submitted tasks, mostly). This would involve some investigation into whether OpenStack would be practical, whether there is another provisioning system we could use, or whether we'll be forced to roll our own (which I'd rather avoid). There should be some tie-in with the graphical installation support and possibly the gnome integration tests.

RPM-OSTree Support/Integration

I haven't used rpm-ostree [2] enough yet to figure out how good of a fit it'd be with taskotron, but from the description of the project and the discussions I've had with cwalters, it sounds like it could be a good fit as part of our provisioning system for disposable clients. If we're serious about proposing this as a GSoC project, we should probably explore it a bit more to be certain that we'd want it now, but I figured it was worth putting on the list.

[2] https://github.com/cgwalters/rpm-ostree

System for apparent results storage and modification

There has to be a better title for this, but it would be one of the last major steps in enabling bodhi/koji to block builds/updates on check failures. The idea would be to provide an interface which can decide whether a build/update is OK based on which checks passed/failed.
It would have a mechanism for manual overrides and algorithmic overrides (i.e., we know that foo has problem X and are working on it, so ignore its failures for now) so that we don't upset packagers more than we need to.

When Josef and I last talked about this, we weren't sure that putting this functionality into our results storage mechanism was wise. It's a different concern that has the potential to make a mess out of the results storage.
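As a purely hypothetical sketch of the decision layer described above, the override logic might boil down to something like the following. None of these function or argument names exist in any current project; this is only meant to illustrate how manual and algorithmic overrides could sit on top of raw check results:

    # Purely hypothetical sketch of the "is this build/update OK?" interface
    # discussed above; all names are invented, not taken from existing code.

    def is_update_accepted(check_results, manual_waivers, known_issues):
        # check_results: dict mapping check name -> 'PASSED' or 'FAILED'
        # manual_waivers: check names a human has waived for this update
        # known_issues: check names globally ignored ("check foo has problem X,
        # we're working on it, ignore its failures for now")
        for check, outcome in check_results.items():
            if outcome == 'PASSED':
                continue
            if check in manual_waivers or check in known_issues:
                continue  # failure overridden, does not block the update
            return False  # an unwaived failure blocks the build/update
        return True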
Default invocation of pytest for qadevel projects
For the qadevel projects I'm aware of, we've been using pytest as our unit/functional test runner. As part of the shared configuration, tests are split up into two categories: unit and functional. Unit tests are fast and do not touch the network, database or filesystem (there are some exceptions to that last part, though). Functional tests tend to be more like integration tests that can set up a database or do other actions which fall outside the previous definition of "unit test". When you run pytest without any extra args, only the unit tests are run. The '--functional' arg is needed to enable collection and execution of the functional tests.

Kamil recently made a request [1] to do one of two things:

1. Change the default such that functional tests are collected, and exclude them from collection using a '--unit' arg
2. Change the functional arg from '--functional' to something shorter, like '--func'

* note that there are restrictions on which args we can use. For example, '-f' is not allowed as single-letter args are reserved for pytest itself

[1] https://phab.qadevel.cloud.fedoraproject.org/T89

As stated in T89, I don't have a strong opinion on this as long as it's possible to exclude the functional tests from collection and we make the same change across all of our active projects. However, I wanted to put this up for wider discussion before changing things. Any other thoughts on this proposed change?

Tim
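For readers unfamiliar with how such a switch is wired up, the sketch below shows one way to do it in a conftest.py, using a custom option plus a "functional" marker. This is an illustration only, not the actual shared qadevel configuration, which may deselect rather than skip the tests:

    # conftest.py -- illustrative sketch only, not the real qadevel config.
    import pytest

    def pytest_addoption(parser):
        # single-letter options such as '-f' are reserved by pytest itself,
        # so only a long option is safe here
        parser.addoption("--functional", action="store_true", default=False,
                         help="also run the functional tests")

    def pytest_collection_modifyitems(config, items):
        if config.getoption("--functional"):
            return  # run everything, unit and functional
        skip_func = pytest.mark.skip(reason="needs --functional to run")
        for item in items:
            # tests marked @pytest.mark.functional stay skipped by default
            if "functional" in item.keywords:
                item.add_marker(skip_func)

With something like that in place, a bare py.test run covers only the unit tests and py.test --functional covers both; flipping the default around, as option 1 above proposes, would mostly mean inverting that check.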
Re: D19 Comments and Diff
On Wed, 5 Mar 2014 12:23:31 -0700 Tim Flink wrote:

> > That's true, but if we have a zero-failed-test policy, why does it
> > matter? If your patch introduced a problem, you need to fix it, and
> > you might discover another one once you do. Yes, it might be a
> > sequence of steps rather than seeing everything at once, but at the
> > benefit of test readability (items logically grouped together,
> > rather than one assertion per method). Also, I'm testing a part of
> > a single function in one method, it's not like it can have large
> > consequences. I understand why splitting larger pieces into smaller
> > pieces is good, but I don't understand why splitting it into one
> > line per method is good. All of those tests are trivial - simple
> > functionality of simple methods.
>
> One of my counter-points here is that if the tests are so trivial, do
> we really need to have them? After a certain point, we can add tests
> but if they aren't incredibly meaningful, we'd just be adding
> maintenance burden and not gaining much from the additional coverage.
>
> > They have a dozen lines, a few asserts. An error in the
> > first line can hide an error or two in the remaining eleven lines,
> > but you'll discover that soon enough, once you fix the code and run
> > the tests again. Is that a problem?
>
> In this particular case, not a huge problem - the code under test is
> relatively trivial. That being said, I don't really want to start off
> with tests that

Complete thoughts do tend to help ...

That being said, I don't really want to start off with tests that are putting too many asserts into one test. Some combinations are fine, but I'd prefer erring on the side of separating things out a bit too much for now rather than risk setting an example of large, monolithic tests.

Tim
Re: D19 Comments and Diff
For some background, I sent this email to Kamil yesterday with a diff containing suggestions on splitting up the unit tests for CheckDetail and python_utils. It didn't seem like a big deal to me at the time, so I just sent it to him. Given the level and type of discussion that it seems to have started, I agree with Kamil that this needs to be discussed with a wider audience, and I'm replying to the list.

The diff has been slightly tweaked with a comment and a test method name change. I've left some commented-out tests in the diff to show where I think tests could be removed without getting rid of meaningful test coverage. I wasn't very clear on this in my first email, but I meant the diff partially as an example and it still needs a bit of work. I'm planning to update the diff and add more comments to the review.

The review that spawned this conversation is up at: https://phab.qadevel.cloud.fedoraproject.org/D19

My comments are inline.

Tim

On Wed, 5 Mar 2014 07:21:09 -0500 (EST) Kamil Paral wrote:

> > Kamil,
> >
> > Instead of putting comments directly into D19, I figured it would be
> > easier to put them in the form of a diff.
>
> It might be better to discuss this publicly, so that others can also
> voice their opinions. You can attach files to Phab using Files, and
> then you can somehow easily reference them inside the comment, I
> guess. That might be a good way to upload a related diff.

As stated above, agreed.

> > I realize this is going back on what I said last week, but I think
> > that separating out the tests even farther would be for the best
> > for the following reasons:
> >
> > 1. I still had a bit of trouble understanding what exactly all the
> > tests were supposed to be doing. I think that separating them out
> > and making the names more verbose will aid in maintainability when
> > other people end up reading the tests
>
> Can you specify which lines exactly were unclear to you? What if I
> provided docstrings for such methods? Because I think that using a
> docstring for explanation is a much better way than trying to squeeze
> it into a method name (which then gets very confusing and complex,
> like test_init_details_set_via_params_output() or
> test_update_result_no_override_higher_priority_fail_pass() - that's
> not readable at all; actually it looks like Java! :-) ). I think
> people would spend more time decrypting the method names than reading
> the docstring.

Yeah, some of the method names are a bit long and obtuse. I could have chosen better names. I fixed the worst one in the new diff, though :)

I'm generally of the mind that folks shouldn't have to dive into docstrings on tests in order to understand what is being tested. It is unavoidable in some cases where you have complex tests but that's usually a test or code smell which should at least be acknowledged.

> > 2. By splitting up the tests, almost all the assertions are always
> > run. When they're grouped together, the test fails at the first
> > failed assertion and not all assertions are run.
>
> That's true, but if we have a zero-failed-test policy, why does it
> matter? If your patch introduced a problem, you need to fix it, and
> you might discover another one once you do. Yes, it might be a
> sequence of steps rather than seeing everything at once, but at the
> benefit of test readability (items logically grouped together, rather
> than one assertion per method). Also, I'm testing a part of a single
> function in one method, it's not like it can have large consequences.
> I understand why splitting larger pieces into smaller pieces is good,
> but I don't understand why splitting it into one line per method is
> good. All of those tests are trivial - simple functionality of simple
> methods.

One of my counter-points here is that if the tests are so trivial, do we really need to have them? After a certain point, we can add tests, but if they aren't incredibly meaningful, we'd just be adding maintenance burden and not gaining much from the additional coverage.

> They have a dozen lines, a few asserts. An error in the
> first line can hide an error or two in the remaining eleven lines,
> but you'll discover that soon enough, once you fix the code and run
> the tests again. Is that a problem?

In this particular case, not a huge problem - the code under test is relatively trivial. That being said, I don't really want to start off with tests that

> > I've gone through and made comments in the diff in addition to
> > splitting up most of the tests.
> >
> > As I'm writing this, I had a couple of problems with the fixtures
> > when using my system py.test which go away with py.test 2.5.2 in a
> > virtualenv. This may be some issue with my local env but I wanted to
> > send off the diff instead of sitting on it another day. I can take
> > another look at the tests if they're causing broader problems.
>
> The fixtures are interesting, thanks. I don't really understand why
> it needs 4 methods instead of just 2, bu