We do have the capability of running gcov on a bunch of systems, merging the results, and marking the source that is never tested. On http://www.mcs.anl.gov/petsc/developers/index.html there is the sentence "The coverage (what lines of source code are tested in the nightly builds) can be found at http://www.mcs.anl.gov/petsc/petsc-dev/index_gcov1.html", but sadly the link is broken.
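A rough sketch of how such a merge might look, assuming the plain-text .gcov files produced on each machine are collected in one place (the script and file names below are purely illustrative, not the tooling actually used for the nightly coverage pages):

#!/usr/bin/env python
# Rough sketch only: merge textual gcov output (e.g. foo.c.gcov) produced on
# several machines for the same source file and report the lines that no
# run ever executed.  File names and paths are illustrative.
import sys

def parse(fname):
    """Map line number -> execution count, or None for non-executable lines."""
    counts = {}
    with open(fname) as f:
        for line in f:
            fields = line.split(':', 2)
            if len(fields) < 3:
                continue
            count, lineno = fields[0].strip(), fields[1].strip()
            if not lineno.isdigit() or int(lineno) == 0:
                continue                      # skip gcov metadata lines
            if count == '-':
                counts[int(lineno)] = None    # line holds no executable code
            elif count.startswith(('#', '=')):  # '#####': executable, never run
                counts[int(lineno)] = 0
            else:
                counts[int(lineno)] = int(count.rstrip('*'))
    return counts

def merge(files):
    merged = {}
    for fname in files:
        for lineno, count in parse(fname).items():
            if count is None:
                merged.setdefault(lineno, None)
            else:
                merged[lineno] = (merged.get(lineno) or 0) + count
    return merged

if __name__ == '__main__':
    # e.g.: python merge_gcov.py machine1/foo.c.gcov machine2/foo.c.gcov
    merged = merge(sys.argv[1:])
    never = [l for l, c in sorted(merged.items()) if c == 0]
    print('lines never executed in any run:', never)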
   Barry

On Jan 24, 2013, at 10:22 AM, Karl Rupp <rupp at mcs.anl.gov> wrote:

> Hi,
>
> On 01/24/2013 09:47 AM, Jed Brown wrote:
>>
>> On Thu, Jan 24, 2013 at 9:39 AM, Karl Rupp <rupp at mcs.anl.gov
>> <mailto:rupp at mcs.anl.gov>> wrote:
>>
>> Testing for the same number of iterations is - as you mentioned - a
>> terrible metric. I see this regularly on GPUs, where rounding modes
>> differ slightly from CPUs. Running a fixed (low) number of
>> iterations is certainly the better choice here, provided that the
>> systems we use for the tests are neither too ill-conditioned nor too
>> well-behaved, so that we can eventually reuse the tests for some
>> preconditioners.
>>
>> That's something that certainly makes sense for tests of functionality,
>> but not for examples/tutorials that new users should encounter, lest
>> they get the impression that they should use such options.
>
> I consider it good practice to keep tests and tutorials separate, just as
> is suggested by our folder hierarchy (even though there may be no strict
> adherence to it right now). Guiding the user towards using a certain
> functionality is fundamentally different from testing that functionality
> for various inputs and/or corner cases.
>
> For example, in ViennaCL I have a test that checks the operation
>   A = B * C
> for dense matrices A, B, and C. In a tutorial, I only show about three uses
> of this functionality. In the test, however, all combinations of transposes,
> submatrices, memory layouts, etc. are exercised, leading to about 200
> variations of this operation. I can't think of any reasonable way of merging
> the two, so I simply optimize each for its particular purpose. So we can only
> check the tutorials on whether they execute properly (i.e. input files found,
> convergence achieved at all, etc.), but they are inherently unsuitable for
> thorough testing.
>
>> Do you have much experience with code coverage tools? It would be very
>> useful if we could automatically identify which tests serve no useful
>> purpose. The amount of time taken by make alltests is currently
>> unreasonable, and though parallel testing will help, I suspect there are
>> also many tests that could be removed (and time-consuming tests that
>> could be made much faster without affecting their usefulness).
>
> Similar to what Matt said, I tried gcov quite a long time ago when I was a
> programming greenhorn. I wasn't satisfied with the results I got at that
> time, so I designed 'implementation-aware' tests instead, which gave
> satisfactory results. Note to self: reevaluate gcov.
>
> Best regards,
> Karli
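To make the fixed-iteration testing idea above concrete, a minimal petsc4py sketch follows; the 1-D Laplacian test problem, the cap of five iterations, and the loose residual check are illustrative assumptions, not anything taken from this thread:

# Minimal sketch of the "fixed (low) number of iterations" idea for a test.
# The 1-D Laplacian, the iteration cap, and the loose residual check are
# illustrative assumptions.
import sys
import petsc4py
petsc4py.init(sys.argv)
from petsc4py import PETSc

n = 100
A = PETSc.Mat().createAIJ([n, n], nnz=3, comm=PETSc.COMM_SELF)
for i in range(n):
    if i > 0:
        A.setValue(i, i - 1, -1.0)
    A.setValue(i, i, 2.0)
    if i < n - 1:
        A.setValue(i, i + 1, -1.0)
A.assemble()

b = PETSc.Vec().createSeq(n)
b.set(1.0)
x = b.duplicate()

ksp = PETSc.KSP().create(comm=PETSc.COMM_SELF)
ksp.setOperators(A)
ksp.getPC().setType('none')   # no preconditioner, so the default rtol is not
ksp.setTolerances(max_it=5)   # reached and every run does exactly 5 iterations
ksp.solve(b, x)

# Check that the residual decreased, not how many iterations convergence
# would take -- the latter can differ between CPU and GPU rounding.
assert ksp.getResidualNorm() < b.norm()

The point of such a test is that the assertion depends only on quantities that are robust to small rounding differences, so the same test can run unchanged on CPU and GPU backends.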
