On Wed, Oct 20, 2010 at 5:33 PM, David Roe <r...@math.harvard.edu> wrote:
> There are a number of tickets in trac about performance regressions in
> Sage. I'm sure there are far more performance regressions which we don't
> know about because nobody noticed.
>
> As someone writing library code, it's generally not obvious that one is
> about to introduce a performance regression (otherwise you'd probably not
> do it). There have been instances where I've wondered whether a change I
> was making might have unintended consequences for performance, but
> testing all the ways in which this could manifest is laborious.
>
> Consequently, I've been thinking recently about how to detect performance
> regressions automatically. There are really two parts to the problem:
> gathering timing data on the Sage library, and analyzing that data to
> determine if regressions have occurred (and how serious they are).
+1. This has been lacking for a long time.

> Data gathering:
>
> One could modify local/bin/sage-doctest to add an option that wraps each
> doctest in a "timeit()" call. This would then generate a timing datum for
> each doctest line.
>
> * These timings vary from run to run (presumably due to differing levels
> of load on the machine). I don't know how to account for this, but
> usually it's a fairly small effect (on the order of 10% error).
>
> * If you're testing against a previous version of sage, the doctest
> structure will have changed, because people wrote more doctests. And
> doctest lines depend on each other: you define variables that are used in
> later lines, so inserting a line could make the timings of later lines
> incomparable to those of the exact same line without the inserted line.
> We might be able to parse the lines and check that the objects involved
> are actually the same across different versions of sage (this would
> require either a version-invariant hash, or saving in one version,
> loading in the other and comparing -- and you would have to do that for
> each variable that occurs in the line), but that seems to be getting too
> complicated...
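For concreteness, the wrapping step might look something like the rough
sketch below (time_doctest_block is a made-up name, and real doctest lines
would first need Sage's preparser applied -- plain Python is used here so
the snippet stands alone):

    import time

    def time_doctest_block(lines, namespace=None):
        """Execute each doctest line once in a shared namespace, so that
        later lines still see variables defined by earlier ones, and
        record (line, cpu_time, wall_time) for each."""
        if namespace is None:
            namespace = {}
        timings = []
        for line in lines:
            cpu0 = time.process_time()
            wall0 = time.perf_counter()
            exec(compile(line, "<doctest>", "single"), namespace)
            timings.append((line, time.process_time() - cpu0,
                            time.perf_counter() - wall0))
        return timings

    # Two dependent lines, as in a real doctest block:
    for line, cpu, wall in time_doctest_block(["x = sum(range(10**6))",
                                               "y = x + 1"]):
        print("%8.4fs cpu  %8.4fs wall  %s" % (cpu, wall, line))

Timing each line only once is noisy; timeit() reduces the noise by
looping, at the cost of total runtime -- the trade-off Robert raises below.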
> Many users will only have one version of sage installed. And even with
> multiple versions, you need somewhere to PUT the data in order to compare
> to it later.
>
> One way of handling these problems is to create a relational database to
> put timings in. This could also be interesting for benchmarketing
> purposes: we could have timings on various machines, and we could
> highlight performance improvements in addition to watching for
> performance regressions.
>
> So, here's a first draft of a database schema to put timing data into.
> I've included a description of each table, along with descriptions of the
> columns I thought were non-obvious. I'm definitely interested in
> suggestions for improving this schema.
>
> Table: Hosts
> # computer information, including identifying data to determine when
> # running on the same host
> col: id
> col: identifying_data # some way to uniquely identify the computer on
> # which a test is being run. Presumably the output of some unix
> # function, but I don't know what.
>
> Table: Sage_version
> # a table giving each existing version of Sage an id.
> # the ids for official sage releases should be consistent across
> # databases; users can also create their own temporary versions, which
> # use a different block of id numbers.
> # when a new version is created, the Files, Doctest_blocks and
> # Doctest_lines tables are updated.
> col: id
> col: version_name # string defining the version
> col: prev_version_id # versions should be totally ordered. Since we want
> # to reserve some id space for official sage versions and other space
> # for users, this can't be done by just the numerical id.
>
> Table: Files
> # a table for all the files in sage that have doctests
> col: id
> col: filename # e.g. "sage/rings/integer.pyx"
> # we might also want some other data, like when the file was added (or
> # deleted), whether it's a python vs cython file...
>
> Table: Doctest_blocks
> # a table for the triple-quote-enclosed strings that hold doctests.
> # theoretically, these could be smaller groupings within such a larger
> # block; but while it's generally true that doctest blocks can be
> # executed independently of each other, it's hard to identify subsets of
> # a block that are independent of each other.
> col: id
> col: name # usually the name of the function, class or file to which
> # this doctest belongs
> col: version_id # sage version in which this block lies
> col: file_id # sage file in which this block lies
> col: prev_id # doctest block id for the analogous block in the previous
> # version of sage; -1 indicates nonexistent
>
> Table: Doctest_lines
> # a table for lines beginning with "sage:" that are executed by the
> # doctest framework.
> # lines that occur many times in the Sage library (e.g. "R.<x> = QQ[]")
> # have a single entry here.
> col: id
> col: code # e.g. "R.<x> = QQ[]"
> # could theoretically support storing equivalent lines of code, such as
> # "R.<x> = PolynomialRing(QQ)"
>
> Table: Block_line_incidences
> # a table to record the many-to-many relation between doctest_blocks and
> # doctest_lines
> col: id
> col: block_id
> col: line_id
> col: line_number # which line in the block this is, e.g. 1 (or 0?) if
> # this is the first doctest line in the block
>
> Table: Regression_runs
> # a table to record times (possibly on different machines) when data was
> # entered into this database from a regression-test run
> col: id
> col: host_id # the computer on which this was run
> col: version_id # the sage version
> col: start_time # the time this run was started
> col: duration # how long this run lasted (would end_time be more useful?)
> # a run doesn't necessarily have to cover the whole library; it could
> # include only a subset of the files.
> # could include other info, like the load on the machine, though I don't
> # know how to measure this...
>
> Table: Timing_data
> # a table to record the timing data we're actually interested in
> col: id
> col: bli_id # the id for a block_line_incidence
> col: run_id # the regression run that this measurement was a part of
> col: cpu_time # the amount of time this took to run, as measured through
> # timeit
> # presumably we require that the two version_ids reached by tracing up
> # through the bli and through the run are the same version.
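To make the proposal concrete, here is a rough sketch of these tables as
SQLite DDL (SQLite is just one plausible backend -- Robert suggests it
below -- and the column types and the uname example are guesses):

    import sqlite3

    # Sketch only: types are guesses, and several descriptive columns
    # mentioned above are omitted for brevity.
    conn = sqlite3.connect("sage_timings.db")
    conn.executescript("""
    CREATE TABLE IF NOT EXISTS hosts (
        id INTEGER PRIMARY KEY,
        identifying_data TEXT  -- e.g. output of uname -a
    );
    CREATE TABLE IF NOT EXISTS sage_versions (
        id INTEGER PRIMARY KEY,
        version_name TEXT,
        prev_version_id INTEGER REFERENCES sage_versions(id)
    );
    CREATE TABLE IF NOT EXISTS files (
        id INTEGER PRIMARY KEY,
        filename TEXT          -- e.g. "sage/rings/integer.pyx"
    );
    CREATE TABLE IF NOT EXISTS doctest_blocks (
        id INTEGER PRIMARY KEY,
        name TEXT,
        version_id INTEGER REFERENCES sage_versions(id),
        file_id INTEGER REFERENCES files(id),
        prev_id INTEGER        -- -1 if no analogous block previously
    );
    CREATE TABLE IF NOT EXISTS doctest_lines (
        id INTEGER PRIMARY KEY,
        code TEXT              -- e.g. "R.<x> = QQ[]"
    );
    CREATE TABLE IF NOT EXISTS block_line_incidences (
        id INTEGER PRIMARY KEY,
        block_id INTEGER REFERENCES doctest_blocks(id),
        line_id INTEGER REFERENCES doctest_lines(id),
        line_number INTEGER
    );
    CREATE TABLE IF NOT EXISTS regression_runs (
        id INTEGER PRIMARY KEY,
        host_id INTEGER REFERENCES hosts(id),
        version_id INTEGER REFERENCES sage_versions(id),
        start_time REAL,
        duration REAL
    );
    CREATE TABLE IF NOT EXISTS timing_data (
        id INTEGER PRIMARY KEY,
        bli_id INTEGER REFERENCES block_line_incidences(id),
        run_id INTEGER REFERENCES regression_runs(id),
        cpu_time REAL
    );
    """)
    conn.commit()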
> There are also questions of how this would be distributed. Presumably the
> data part wouldn't come standard with sage. Maybe an optional spkg? Of
> course, you're mainly interested in comparing DIFFERENT versions of sage
> on the SAME computer, which doesn't really fit with how we normally
> distribute code and data.
>
> Analysis:
>
> Once we have the database set up and have gathered some data, you can do
> all kinds of things with it. I'm most interested in how to find speed
> regressions, and I can certainly imagine writing code to do so. You have
> data from a previous version of sage (or your own, from before you
> applied some patch(es)), and you run a regression test with "sage -rt" or
> something; this generates the same kind of timing data, and then you look
> for slowdowns, either absolute or relative to the average ratio of your
> current run to the previous run for a given line of code. It can then
> print out a sorted list of lines with slowdowns, or just the ones above a
> certain threshold, or...
>
> I'd like to hear more ideas on how to improve all of this.

timeit() might be a bit extreme--I think it performs a minimum of 5 loops,
and I don't look forward to 15-cpu-hour test runs. Ideally we could just
collect (noisier) timing information by default, and have it spit out a
sqlite database object (perhaps in a default location if none is
specified).

I would say you don't really need a complicated schema. Perhaps a single
table would suffice, with columns

    OS info/id
    Sage version
    hg changeset (useful to tell if it's vanilla)
    file
    doctest identifier (fully qualified function name?)
    doctest content
    cpu time
    wall time
    run id (unique per run)

There'd be a lot of redundancy in some of the columns, but that's OK--
it'll just compress nicely. It could perhaps even be a comma/tab-separated
text file, to make things even simpler.

A separate tool could be used to compare runs, flagging any egregious (or
favorable) differences; a sketch follows below. Matching could be done in
several ways--on the text content, on the method name, etc.

- Robert
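A rough sketch of what that comparison tool might look like, assuming the
single-table layout above lives in SQLite (the table and column names --
timings, run_id, doctest_content, cpu_time -- are made up for
illustration), matching runs on the doctest text itself:

    import sqlite3

    def compare_runs(db_path, old_run, new_run, threshold=1.5):
        """Print doctest lines whose cpu time grew by more than
        `threshold` times between two runs, worst first."""
        conn = sqlite3.connect(db_path)
        rows = conn.execute("""
            SELECT r1.doctest_content, r1.cpu_time, r2.cpu_time
            FROM timings AS r1
            JOIN timings AS r2
              ON r2.doctest_content = r1.doctest_content
            WHERE r1.run_id = ? AND r2.run_id = ?
              AND r1.cpu_time > 0            -- avoid dividing by zero
              AND r2.cpu_time > ? * r1.cpu_time
            ORDER BY r2.cpu_time / r1.cpu_time DESC
        """, (old_run, new_run, threshold)).fetchall()
        for content, t_old, t_new in rows:
            print("%6.2fx  %.4fs -> %.4fs  %s"
                  % (t_new / t_old, t_old, t_new, content))

    # e.g. compare_runs("timings.db", old_run=1, new_run=2, threshold=1.10)

A line whose text appears several times per run would match more than once
here; matching on the doctest identifier instead would disambiguate.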