On Thu, Apr 5, 2012 at 7:05 AM, Robert Haas <robertmh...@gmail.com> wrote:
> On Thu, Apr 5, 2012 at 9:29 AM, Greg Stark <st...@mit.edu> wrote:
>> On Thu, Apr 5, 2012 at 2:24 PM, Robert Haas <robertmh...@gmail.com> wrote:
>>> Sorry, I don't understand specifically what you're looking for.  I
>>> provided latency percentiles in the last email; what else do you want?
>>
>> I think he wants how many waits there were between 0 and 1s, how many
>> between 1s and 2s, etc.  Mathematically it's equivalent, but I also
>> have trouble visualizing just how much improvement is represented by
>> the 90th percentile dropping from 1688 to 1620 (ms?)
>
> Yes, milliseconds.  Sorry for leaving out that detail.  I've run these
> scripts so many times that my eyes are crossing.  Here are the
> latencies, bucketized by seconds, first for master and then for the
> patch, on the same test run as before:
>
>  0 26179411
>  1     3642
>  2      660
>  3      374
>  4      166
>  5      356
>  6       41
>  7        8
>  8       56
>  9        0
> 10        0
> 11       21
> 12       11
>
>  0 26199130
>  1     4840
>  2      267
>  3      290
>  4       40
>  5       77
>  6       36
>  7        3
>  8        2
>  9       33
> 10       37
> 11        2
> 12        1
> 13        4
> 14        5
> 15        3
> 16        0
> 17        1
> 18        1
> 19        1
>
> I'm not sure I find those numbers all that helpful, but there they
> are.  There are a couple of outliers beyond 12 s on the patched run,
> but I wouldn't read anything into that; the absolute worst values
> bounce around a lot from test to test.  However, note that every
> bucket between 2s and 8s improves, sometimes dramatically.
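For what it's worth, the per-bucket figures above can be collapsed into a
simple tail count ("how many transactions took at least N seconds"), which
is the comparison that matters when asking whether a bucket improved or its
contents merely got pushed into a higher bucket.  A quick sketch, using the
numbers Robert posted (this is just a cumulative sum over the buckets, not
a real Kaplan-Meier fit; the function and variable names are made up here):

```python
# Latency histograms from the test run above: index = whole seconds,
# value = number of transactions whose latency fell in that bucket.
master = [26179411, 3642, 660, 374, 166, 356, 41, 8, 56, 0, 0, 21, 11]
patched = [26199130, 4840, 267, 290, 40, 77, 36, 3, 2, 33, 37, 2, 1,
           4, 5, 3, 0, 1, 1, 1]

def tail_counts(buckets):
    """For each threshold t, count transactions with latency >= t seconds."""
    out = []
    running = 0
    for count in reversed(buckets):
        running += count
        out.append(running)
    out.reverse()
    return out

m = tail_counts(master)
p = tail_counts(patched)
for sec in range(9):
    print(f">= {sec}s: master {m[sec]:>9}  patched {p[sec]:>9}")
# The >= 8s tail is 88 on master vs 90 with the patch, which is the
# "pushed into a higher bucket" effect discussed below.
```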
However, if it "improved" a bucket by pushing its contents out into a
higher bucket, that is not really an improvement.  At 8 seconds *or
higher*, for example, it goes from 88 transactions on master to 90 with
the patch.

Maybe something like a Kaplan-Meier survival curve analysis would be the
way to go (where long transaction "survival" is bad).  But that is
probably overkill.

What were full_page_writes and wal_buffers set to for these runs?

> It's worth
> keeping in mind here that the system is under extreme I/O strain on
> this test, and the kernel responds by forcing user processes to sleep
> when they try to do I/O.

Should the tests be dialed back a bit so that the I/O strain is less
extreme?  Analysis is probably best done just past the scalability knee,
not long after the point where the server has already collapsed into a
quivering mass.

Cheers,

Jeff

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers