On Sun, Jun 14, 2009 at 12:39 AM, Robert Haas <robertmh...@gmail.com> wrote:
> On Sat, Jun 13, 2009 at 2:48 PM, Tom Lane<t...@sss.pgh.pa.us> wrote:
> > Greg Stark <greg.st...@enterprisedb.com> writes:
> >> I'm not sure about that. It seems like race conditions with autovacuum
> >> are a real potential bug that it would be nice to be testing for.
> >
> > It's not a bug; it's a limitation of our testing framework that it sees
> > this as a failure. Serious testing for autovac race conditions would
> > indeed be interesting, but you're never going to get anything meaningful
> > in that direction out of the current framework.
>
> The elephant in the room here may be moving to some more
> flexible/powerful testing framework, but the difficulty will almost
> certainly be in agreeing what it should look like. The actual writing
> of said test framework will take some work too, but to some degree
> that's a SMOP.
>
> This tuple-ordering issue seems to be one that comes up over and over
> again, but in the short term, making it a TEMP table seems like a
> reasonable fix.
>

I am forwarding a mail containing a Perl script and a pair of sample files
that I developed about a year ago. The forwarded mail text explains what the
script is trying to do.

A line beginning with '?' in the expected file is treated specially. The rest
of such a line is taken as a regular expression, which is used to match the
corresponding line from the actual output. If the '?' is immediately followed
by the word 'unordered', all the lines up to a line containing '?/unordered'
are buffered and compared against the corresponding lines from the result
file, ignoring the order of the result lines.

Although we at EnterpriseDB have resolved these issues by other means
(alternate expected files etc.) and do not use this script, I think it might
be useful for the community regression tests.

Best regards,

---------- Forwarded message ----------
From: Gurjeet Singh <gurjeet.si...@enterprisedb.com>
Date: Fri, Aug 8, 2008 at 1:45 AM
Subject: neurodiff: a new diff utility for our regression test suites

Hi All,

PFA a Perl script that implements a new kind of comparison, which might help
us in situations like the one we encountered recently with differing plan
costs in the hints patch. This script implements two new kinds of comparison:

    i) Regular Expression (RE) based comparison, and
    ii) comparison of an unordered group of lines.

The input for this script, just like regular diff, is two files: the expected
output and the actual output. Lines in the expected output file that are
expected to vary should start with a '?' character followed by an RE that the
corresponding actual line should match.

For example, if we wish to compare a line of EXPLAIN output that includes the
cost component, the expected line might look like:

    ? Index Scan using accounts_i1 on accounts \(cost=\d+\.\d+\.\.\d+\.\d+ rows=\d+ width=\d+\)

The above RE matches any line of that shape, such as:

    Index Scan using accounts_i1 on accounts (cost=0.00..8.28 rows=1 width=106)

or

    Index Scan using accounts_i1 on accounts (cost=1000.9999..2000.20008 rows=10000 width=1000)

Apart from this, the SQL standard does not guarantee any order of results
unless the query has an explicit ORDER BY clause. We often encounter cases in
our result files where the output differs from the expected only in the order
of the result. To bypass this effect, and to keep the 'diff' quiet, I have
seen people invariably add an ORDER BY clause to the query and modify the
expected file accordingly. There is a remote possibility of the ORDER BY
clause masking an issue/bug that would otherwise have shown up in the diffs
or might have caused a crash.

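To make the '?' rule above concrete, here is a minimal sketch in Python. It is only an illustration, not the attached neurodiff.pl: the function name `line_matches` is hypothetical, and anchoring the RE to the whole line is an assumption, since the mail does not say how the script anchors its patterns. The sample lines start with a space because the RE in the example does.

```python
import re


def line_matches(expected: str, actual: str) -> bool:
    """Compare one expected line with one actual line.

    A leading '?' marks the rest of the expected line as a regular
    expression. Requiring it to match the whole actual line is an
    assumption of this sketch.
    """
    if expected.startswith("?"):
        return re.fullmatch(expected[1:], actual) is not None
    # Ordinary lines are compared literally, as plain diff would do.
    return expected == actual


# Expected line taken from the EXPLAIN example in the mail.
EXPECTED = (r"? Index Scan using accounts_i1 on accounts "
            r"\(cost=\d+\.\d+\.\.\d+\.\d+ rows=\d+ width=\d+\)")

if __name__ == "__main__":
    samples = [
        " Index Scan using accounts_i1 on accounts (cost=0.00..8.28 rows=1 width=106)",
        " Index Scan using accounts_i1 on accounts (cost=1000.9999..2000.20008 rows=10000 width=1000)",
        " Seq Scan on accounts (cost=0.00..8.28 rows=1 width=106)",
    ]
    for line in samples:
        print(line_matches(EXPECTED, line), line)
```

The first two sample lines differ only in their cost figures and both satisfy the pattern; the third names a different plan node and does not.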
Using this script we can put special markers in the expected output that
denote the boundaries of a set of lines expected to be produced in an
undefined order. The script will not complain unless a line is actually
missing from, or extra in, the output.

Suppose that we have the following result-set to compare:

    4 | JACK
    5 | CATHY
    2 | SCOTT
    1 | KING
    3 | MILLER

The expected file would look like this:

    ?unordered
    1 | KING
    2 | SCOTT
    ?\d \| MILLER
    4 | JACK
    5 | CATHY
    ?/unordered

This expected file will also succeed for both of the following variations of
the result-set:

    5 | CATHY
    4 | JACK
    3 | MILLER
    2 | SCOTT
    1 | KING

or

    1 | KING
    4 | JACK
    3 | MILLER
    2 | SCOTT
    5 | CATHY

Also, as shown in the above example, the RE-based matching works for lines
within the 'unordered' set too.

The beauty of this approach for testing pattern matches and unordered results
is that we don't have to modify the test cases in any way; we just need to
make adjustments in the expected output files.

I am no Perl guru, so I definitely see a lot of performance/semantic
improvements possible (Perl gurus, take a stab); and maybe that's the reason
the script looks more like a C program than a whacky Perl script full of
~!$^ and whatnot.

This script cannot identify hunks the way 'diff' can, which means that even
if a single line is missing, or if there is an extra line somewhere in the
result file, all the remaining lines from both files will show up in the
diff. But I think we do not need hunk identification as much as we need the
features this script provides.

Some time ago I had attempted to implement these very features in diffutils
(diff et al.), but gave up too early! And then Dave's mention two days ago
about trying to remove MinGW dependencies and moving to Perl prompted me to
start afresh in Perl, and it was amazingly simple in Perl (but was time
consuming, as I am a complete newbie)!

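The unordered comparison described in the forwarded mail can be sketched as follows, again in Python and purely as an illustration rather than the attached Perl script. All function names are hypothetical. Two points are assumptions of this sketch: REs are anchored to the whole line, and literal lines inside a block are paired off before RE lines so that a pattern cannot claim a line that a literal needs. The pairing is greedy, so a block with several overlapping REs could be rejected even though a valid pairing exists.

```python
import re
from typing import List

UNORDERED_START = "?unordered"
UNORDERED_END = "?/unordered"


def line_matches(expected: str, actual: str) -> bool:
    """'?' prefix means: treat the rest as an RE matching the whole line."""
    if expected.startswith("?"):
        return re.fullmatch(expected[1:], actual) is not None
    return expected == actual


def unordered_block_matches(expected: List[str], actual: List[str]) -> bool:
    """True if every expected line pairs with a distinct actual line."""
    if len(expected) != len(actual):
        return False
    remaining = list(actual)
    # Literal lines first (False sorts before True), then RE lines.
    for exp in sorted(expected, key=lambda e: e.startswith("?")):
        for i, act in enumerate(remaining):
            if line_matches(exp, act):
                del remaining[i]
                break
        else:
            return False  # no actual line left that satisfies exp
    return True


def files_match(expected: List[str], actual: List[str]) -> bool:
    """Walk both files, switching to order-insensitive mode inside markers."""
    e = a = 0
    while e < len(expected):
        if expected[e] == UNORDERED_START:
            # Raises ValueError if the closing marker is missing.
            end = expected.index(UNORDERED_END, e + 1)
            block = expected[e + 1:end]
            if not unordered_block_matches(block, actual[a:a + len(block)]):
                return False
            a += len(block)
            e = end + 1
        else:
            if a >= len(actual) or not line_matches(expected[e], actual[a]):
                return False
            a += 1
            e += 1
    return a == len(actual)  # leftover actual lines count as a mismatch


# Expected file from the example in the mail.
EXPECTED = ["?unordered", "1 | KING", "2 | SCOTT", r"?\d \| MILLER",
            "4 | JACK", "5 | CATHY", "?/unordered"]

if __name__ == "__main__":
    result = ["4 | JACK", "5 | CATHY", "2 | SCOTT", "1 | KING", "3 | MILLER"]
    print(files_match(EXPECTED, result))
```

Unlike the script as described, this sketch only reports whether the files match; it produces no diff output at all.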
Best regards,
--
Lets call it Postgres

EnterpriseDB http://www.enterprisedb.com

gurjeet[.sin...@enterprisedb.com
singh.gurj...@{ gmail | hotmail | indiatimes | yahoo }.com

Mail sent from my BlackLaptop device

neurodiff.pl
Description: Binary data
expected.out
Description: Binary data
result.out
Description: Binary data
-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers