On 2013-01-19 17:33:05 -0500, Steve Singer wrote:
> On 13-01-09 03:07 PM, Tom Lane wrote:
> >Andres Freund <and...@2ndquadrant.com> writes:
> >>Well, I *did* benchmark it as noted elsewhere in the thread, but thats
> >>obviously just machine (E5520 x 2) with one rather restricted workload
> >>(pgbench -S -jc 40 -T60). At least its rather palloc heavy.
> >>Here are the numbers:
> >>before:
> >>#101646.763208 101350.361595 101421.425668 101571.211688 101862.172051
> >>101449.857665
> >>after:
> >>#101553.596257 102132.277795 101528.816229 101733.541792 101438.531618
> >>101673.400992
> >>So on my system if there is a difference, its positive (0.12%).
> >pgbench-based testing doesn't fill me with a lot of confidence for this
> >--- its numbers contain a lot of communication overhead, not to mention
> >that pgbench itself can be a bottleneck. It struck me that we have a
> >recent test case that's known to be really palloc-intensive, namely
> >Pavel's example here:
> >http://www.postgresql.org/message-id/CAFj8pRCKfoz6L82PovLXNK-1JL=jzjwat8e2bd2pwnkm7i7...@mail.gmail.com
> >
> >I set up a non-cassert build of commit
> >78a5e738e97b4dda89e1bfea60675bcf15f25994 (ie, just before the patch that
> >reduced the data-copying overhead for that). On my Fedora 16 machine
> >(dual 2.0GHz Xeon E5503, gcc version 4.6.3 20120306 (Red Hat 4.6.3-2))
> >I get a runtime for Pavel's example of 17023 msec (average over five
> >runs). I then applied oprofile and got a breakdown like this: ...
> >
> >The number of calls of AllocSetAlloc certainly hasn't changed at all, so
> >how did that get faster?
> >
> >I notice that the postgres executable is about 0.2% smaller, presumably
> >because a whole lot of inlined fetches of CurrentMemoryContext are gone.
> >This makes me wonder if my result is due to chance improvements of cache
> >line alignment for inner loops.
> >
> >I would like to know if other people get comparable results on other
> >hardware (non-Intel hardware would be especially interesting). If this
> >result holds up across a range of platforms, I'll withdraw my objection
> >to making palloc a plain function.
> >
> >			regards, tom lane
> >
>
> Sorry for the delay I only read this thread today.
>
>
> I just tried Pawel's test on a POWER5 machine with an older version of gcc
> (see the grebe buildfarm animal for details)
>
> 78a5e738e: 37874.855 (average of 6 runs)
> 78a5e738 + palloc.h + mcxt.c: 38076.8035
>
> The functions do seem to slightly slow things down on POWER. I haven't
> bothered to run oprofile or tprof to get a breakdown of the functions
> since Andres has already removed this from his patch.
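For anyone wanting to put the two sets of figures quoted above on the same footing, here is a minimal sketch that recomputes the relative changes from the posted numbers. It is plain Python and not part of any patch; the helper name `relative_change` is made up for illustration, and it assumes the POWER5 figures are runtimes in msec (lower is better) while the pgbench figures are tps (higher is better).

```python
from statistics import mean


def relative_change(before, after):
    """Percent change of mean(after) relative to mean(before)."""
    base = mean(before)
    return (mean(after) - base) / base * 100.0


# pgbench -S results as quoted above (tps, higher is better).
tps_before = [101646.763208, 101350.361595, 101421.425668,
              101571.211688, 101862.172051, 101449.857665]
tps_after = [101553.596257, 102132.277795, 101528.816229,
             101733.541792, 101438.531618, 101673.400992]

# POWER5 results as quoted above. These are already averages;
# the unit is assumed to be msec (lower is better).
runtime_before = [37874.855]
runtime_after = [38076.8035]

pgbench_change = relative_change(tps_before, tps_after)
power5_change = relative_change(runtime_before, runtime_after)

print(f"pgbench tps change:    {pgbench_change:+.2f}%")
print(f"POWER5 runtime change: {power5_change:+.2f}%")
```

This gives roughly +0.12% tps for the pgbench runs and roughly +0.53% runtime (i.e. slower) for the POWER5 runs. Both are small enough that run-to-run noise should be kept in mind when reading them.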
I haven't removed it from the patch afaik, so it would be great to get a
profile here! It's "only" for xlogdump, but that tool helped me immensely
and I don't want to maintain it independently...

Greetings,

Andres Freund

--
 Andres Freund	                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services