Re: [HACKERS] Load Distributed Checkpoints test results

Greg Smith Wed, 20 Jun 2007 13:10:35 -0700

On Wed, 20 Jun 2007, Heikki Linnakangas wrote:

Another series with 150 warehouses is more interesting. At that # of warehouses, the data disks are 100% busy according to iostat. The 90% percentile response times are somewhat higher with LDC, though the variability in both the baseline and LDC test runs seem to be pretty high.

Great, this is exactly the behavior I had observed and wanted someone else to independently run into. When you're in 100% disk busy land, LDC can shift the distribution of bad transactions around in a way that some people may not be happy with, and that might represent a step backward from the current code for them. I hope you can understand now why I've been so vocal that it must be possible to pull this new behavior out so the current form of checkpointing is still available.

While it shows up in the 90% figure, what happens is most obvious in the response time distribution graphs. Someone who is currently getting a run like #295: http://community.enterprisedb.com/ldc/295/rt.html

might be really unhappy if they turn on LDC expecting to smooth out checkpoints and get the shift of #296 instead: http://community.enterprisedb.com/ldc/296/rt.html

That is of course cherry-picking the most extreme examples. But it illustrates my concern about the possibility for LDC making things worse on a really overloaded system, which is kind of counter-intuitive because you might expect that would be the best case for its improvements.

When I summarize the percentile behavior from your results with 150 warehouses in a table like this:


Test    LDC %   90%
295     None    3.703
297     None    4.432
292     10      3.432
298     20      5.925
296     30      5.992
294     40      4.132

I think it does a better job of showing how LDC can shift the top percentile around under heavy load, even though there are runs where it's a clear improvement. Since there is so much variability in results when you get into this territory, you really need to run a lot of these tests to get a feel for the spread of behavior. I spent about a week of continuously running tests stalking this bugger before I felt I'd mapped out the boundaries with my app. You've got your own priorities, but I'd suggest you try to find enough time for a more exhaustive look at this area before nailing down the final form for the patch.
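
To make the spread easier to see, here is a rough sketch that groups the six runs from the table into baseline and LDC and reports the range and mean of the 90th percentile figure for each group. The variable and function names are made up for illustration, and with only two baseline runs and four LDC runs the numbers say nothing statistically; the point is only how wide the ranges are and how much they overlap.

```python
from statistics import mean

# (test number, LDC % setting or None for baseline, 90th percentile figure)
# Values copied from the 150 warehouse table above; units are as reported there.
results = [
    (295, None, 3.703),
    (297, None, 4.432),
    (292, 10, 3.432),
    (298, 20, 5.925),
    (296, 30, 5.992),
    (294, 40, 4.132),
]


def summarize(rows):
    """Return count, min, max and mean of the 90th percentile column."""
    values = [p90 for _, _, p90 in rows]
    return {
        "n": len(values),
        "min": min(values),
        "max": max(values),
        "mean": mean(values),
    }


baseline = summarize([r for r in results if r[1] is None])
ldc = summarize([r for r in results if r[1] is not None])

for label, stats in (("baseline", baseline), ("LDC", ldc)):
    print(f"{label:8s} n={stats['n']} min={stats['min']:.3f} "
          f"max={stats['max']:.3f} mean={stats['mean']:.3f}")
```

The LDC group contains both the best run (#292) and the worst run (#296) of the whole set, which is the shifting around of the top percentile described above.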


--
* Greg Smith [EMAIL PROTECTED] http://www.gregsmith.com Baltimore, MD
