Re: [HACKERS] Is it really such a good thing for newNode() to be a macro?
"Stephen R. van den Berg" <[EMAIL PROTECTED]> writes: > Tom Lane wrote: >> ({ Node *__result__; \ > Please avoid identifiers starting with __ . Yeah, I had misremembered which way that rule went. It's "_result" as committed. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Is it really such a good thing for newNode() to be a macro?
Tom Lane wrote: >Accordingly, I'm going to go ahead with this: >#ifdef __GNUC__ >/* With GCC, we can use a compound statement within an expression */ >#define newNode(size, tag) \ >({ Node *__result__; \ Please avoid identifiers starting with __ . ANSI reserves those for the implementation (the compiler and lib). Either use the old global name or something like newMode_result. -- Sincerely, Stephen R. van den Berg. Several ways to kill a programmer: kill -15, fair trial; kill -1, death by hanging; kill -2, suicide; kill -3, euthanasia; kill -9, execution. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Is it really such a good thing for newNode() to be a macro?
"Alex Hunsaker" <[EMAIL PROTECTED]> writes: > Hrm, I tried it on my x86_64 (quad core 2.66ghz, sorry no exotic > machines here :)) and did not see any real noticeable difference > between the two... Thanks. That confirms my feeling that I was seeing some weird artifact on my own x86_64 box. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Is it really such a good thing for newNode() to be a macro?
On Fri, Aug 29, 2008 at 1:16 PM, Tom Lane <[EMAIL PROTECTED]> wrote: > In theory the above implementation of newNode should be a clear win, > so I'm thinking this result must be an artifact of some kind. I'm > going to go try it on PPC and HPPA machines next; does anyone want to > try it on something else? Hrm, I tried it on my x86_64 (quad core 2.66ghz, sorry no exotic machines here :)) and did not see any real noticeable difference between the two... Here is what I tried: (all with ./configure --enable-debug and make clean in between) CVS HEAD: tps = 30.375794 tps = 31.138078 tps = 30.928565 #define newNode(size, tag) \ ({ Node *newNodeMacroHolder; \ AssertMacro((size) >= sizeof(Node));/* need the tag, at least */ \ newNodeMacroHolder = (Node *) palloc0fast(size); \ newNodeMacroHolder->type = (tag); \ newNodeMacroHolder; \ }) tps = 30.814628 tps = 30.706080 tps = 31.10788 static inline Node *newNode(Size size, NodeTag tag) { Node *newNode; Assert(size >= sizeof(Node)); newNode = (Node *) palloc0(size); newNode->type = tag; return newNode; } tps = 30.317978 tps = 30.786187 tps = 30.747112 -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Is it really such a good thing for newNode() to be a macro?
I wrote: > In theory the above implementation of newNode should be a clear win, > so I'm thinking this result must be an artifact of some kind. I'm > going to go try it on PPC and HPPA machines next; does anyone want to > try it on something else? Repeating the explain test on several machines, I get: HPPA/HPUX: gcc-specific macro is about 1.5% faster than CVS HEAD PPC/Darwin: gcc-specific macro is about 1% faster x86/Darwin: seems to be a dead heat So it seems that getting rid of the global doesn't really help on current Intel hardware, but it does help on other architectures. I'm pretty well convinced that the slowdown on my x86_64 machine is an artifact that will disappear as soon as anybody changes the backend code materially. Accordingly, I'm going to go ahead with this: #ifdef __GNUC__ /* With GCC, we can use a compound statement within an expression */ #define newNode(size, tag) \ ({ Node *__result__; \ AssertMacro((size) >= sizeof(Node));/* need the tag, at least */ \ __result__ = (Node *) palloc0fast(size); \ __result__->type = (tag); \ __result__; \ }) #else old form of macro here #endif /* __GNUC__ */ I'm not entirely sure whether ICC will take this, but the buildfarm will tell us soon enough. It's tempting to consider whether we should use this construct in place of "static inline" functions elsewhere, such as list_length(). regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Is it really such a good thing for newNode() to be a macro?
Peter Eisentraut <[EMAIL PROTECTED]> writes: > Tom Lane wrote: >> I considered that one, but since part of my argument is that inlining >> this is a waste of code space, it seems like a better inlining >> technology isn't really the answer. > The compiler presumably has the intelligence and the command-line options to > control how much inlining one wants to do. But without any size vs. > performance measurements it is an idle discussion. Getting rid of a global > variable and macro ugliness is a worthwhile goal of its own. I got around to doing some experiments. The method suggested by Heikki (out-of-line subroutine for everything except the MemSetTest) reduces the size of the backend executable by about 0.5% (about 20K) in CVS HEAD on Fedora 9 x86_64, in a non-assert-enabled build. However it also makes it measurably slower. I couldn't detect any difference in a regular pgbench run, so instead I timed iterations of this: explain select * from tenk1 a join tenk1 b using(unique1) join tenk1 c on a.unique1 = c.unique2 join tenk1 d on a.unique1 = d.thousand join tenk1 e on a.unique1 = e.ten join tenk1 f on a.unique1 = f.tenthous join tenk1 g on a.unique1 = g.unique2 where exists(select 1 from int4_tbl where f1 = b.unique2); in the regression database. Put the above (as a single line!) into "explainjoin.sql" and try pgbench -c 1 -t 100 -n -f explainjoin.sql regression This is mostly stressing the planner, which is pretty newNode-heavy. I get consistently about 14.1 tps on straight CVS HEAD and about 13.8 with the partially out-of-line implementation. I also tried the "static inline" implementation, but that doesn't work at all: gcc refuses to inline it, which makes the palloc0fast call a dead loss. So indeed what we need here is a better inlining technology. I looked into using gcc's "a compound statement enclosed in parentheses is an expression" extension, thus: #define newNode(size, tag) \ ({ Node *newNodeMacroHolder; \ AssertMacro((size) >= sizeof(Node));/* need the tag, at least */ \ newNodeMacroHolder = (Node *) palloc0fast(size); \ newNodeMacroHolder->type = (tag); \ newNodeMacroHolder; \ }) This gets rid of the global, but incredibly, it's even slower: 13.5 tps on the explain test. I do not understand that result. I looked at the generated machine code to verify that it was what I expected, and indeed it's about the same as CVS HEAD except that there's no store-and-fetch into a global. Getting rid of the global variable accesses reduces the size of the backend by about 12K on this architecture, and the only theory I can think of is that that moves things around enough to make the instruction cache less efficient on some code path that this test happens to exercise heavily. In theory the above implementation of newNode should be a clear win, so I'm thinking this result must be an artifact of some kind. I'm going to go try it on PPC and HPPA machines next; does anyone want to try it on something else? regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Is it really such a good thing for newNode() to be a macro?
On Wed, Aug 27, 2008 at 4:05 PM, Heikki Linnakangas <[EMAIL PROTECTED]> wrote: > Tom Lane wrote: > BTW, it would be nice to have a test case that shows it's worth messing with > that. At a guess the parser would be a good place to look. Perhaps a benchmark of a parsing a very large query? One thing I wonder is if you (or someone else) gets around to doing the simple stack allocator for the parser if it will make microoptimizations like this more visible. -- greg -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Is it really such a good thing for newNode() to be a macro?
Tom Lane wrote: > I considered that one, but since part of my argument is that inlining > this is a waste of code space, it seems like a better inlining > technology isn't really the answer. The compiler presumably has the intelligence and the command-line options to control how much inlining one wants to do. But without any size vs. performance measurements it is an idle discussion. Getting rid of a global variable and macro ugliness is a worthwhile goal of its own. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Is it really such a good thing for newNode() to be a macro?
Tom Lane wrote: I'll have a go at this later ... unless you want to do it? Nah, I don't care enough. BTW, it would be nice to have a test case that shows it's worth messing with that. -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Is it really such a good thing for newNode() to be a macro?
"Heikki Linnakangas" <[EMAIL PROTECTED]> writes: > Well, we could still have the MemSetTest outside the function, and > evaluated at compile-time, if we provided an aligned and unaligned > version of newNode: Yeah, that should work fine, since we expect MemSetTest to reduce to a compile-time constant. I'll have a go at this later ... unless you want to do it? I suppose it could cause funny behavior due to the size being evaluated more than once, but the existing macro already does that when Asserts are on, so in practice I don't see an objection there. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Is it really such a good thing for newNode() to be a macro?
Tom Lane wrote: "Heikki Linnakangas" <[EMAIL PROTECTED]> writes: Note that the MemSetLoop macro used in palloc0fast is supposed to be evaluated at compile time, Oooh, good point, I had forgotten about that little detail. Yeah, we'll lose that optimization if we move the code out-of-line. Well, we could still have the MemSetTest outside the function, and evaluated at compile-time, if we provided an aligned and unaligned version of newNode: #define newNode(size, tag) \ ( \ AssertMacro((size) >= sizeof(Node)), /* need the tag, at least */ \ ( MemSetTest(0, sz) ? \ newNodeAligned(CurrentMemoryContext, sz, tag) : \ newNodeNotAligned(CurrentMemoryContext, sz, tag)) And if you're worried about the function call overhead, newNode(Not)Aligned could have the contents of MemoryContextAllocZero inlined into it. Not sure how much optimization is worthwhile here.. -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Is it really such a good thing for newNode() to be a macro?
"Heikki Linnakangas" <[EMAIL PROTECTED]> writes: > Tom Lane wrote: >> I happened to be looking at nodes.h and started wondering just how >> sane this coding really is: > Note that the MemSetLoop macro used in palloc0fast is supposed to be > evaluated at compile time, Oooh, good point, I had forgotten about that little detail. Yeah, we'll lose that optimization if we move the code out-of-line. So I guess a fallback position is #if __gcc__ use a "static inline" function, else the existing code. That would at least let us get rid of the global-variable assignment in gcc-based builds. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Is it really such a good thing for newNode() to be a macro?
Tom Lane wrote: I happened to be looking at nodes.h and started wondering just how sane this coding really is: extern PGDLLIMPORT Node *newNodeMacroHolder; #define newNode(size, tag) \ ( \ AssertMacro((size) >= sizeof(Node)),/* need the tag, at least */ \ newNodeMacroHolder = (Node *) palloc0fast(size), \ newNodeMacroHolder->type = (tag), \ newNodeMacroHolder \ ) Given that we're calling palloc, it's not clear that saving one level of function call is really buying much; and what it's costing us is a store to a global variable that the compiler has no way to optimize away. On a lot of platforms, accessing global variables isn't especially cheap. Also, considering that palloc0fast is a nontrivial macro, and that there are a LOT of uses of newNode(), we're paying rather a lot of code space for a pretty dubious savings. Note that the MemSetLoop macro used in palloc0fast is supposed to be evaluated at compile time, so the code space taken by that macro isn't that big. Turning newNode into function would force it to be evaluated at run-time instead. -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Is it really such a good thing for newNode() to be a macro?
"Stephen R. van den Berg" <[EMAIL PROTECTED]> writes: > b. Create a function newNode() which is declared as inline, which >basically gives you the same code as under (a). I considered that one, but since part of my argument is that inlining this is a waste of code space, it seems like a better inlining technology isn't really the answer. The other two alternatives would force notational changes on all the callers, which doesn't seem appealing (there are close to 1400 calls of makeNode() in the backend...) regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Is it really such a good thing for newNode() to be a macro?
Tom Lane wrote: >I happened to be looking at nodes.h and started wondering just how >sane this coding really is: >extern PGDLLIMPORT Node *newNodeMacroHolder; >#define newNode(size, tag) \ >( \ >AssertMacro((size) >= sizeof(Node)),/* need the tag, at least */ \ >newNodeMacroHolder = (Node *) palloc0fast(size), \ >newNodeMacroHolder->type = (tag), \ >newNodeMacroHolder \ >) >Given that we're calling palloc, it's not clear that saving one level of >function call is really buying much; and what it's costing us is a store >to a global variable that the compiler has no way to optimize away. >On a lot of platforms, accessing global variables isn't especially >cheap. Also, considering that palloc0fast is a nontrivial macro, and >that there are a LOT of uses of newNode(), we're paying rather a lot of >code space for a pretty dubious savings. Correct analysis, I'd say. >So I'm tempted to get rid of this and just make newNode() an out-of-line >function. Getting rid of the global variable is imperative. However, for the rest you'd have two alternate options (besides making it a normal function): a. Use macros like: #define makeNode(_type_,_variable_) \ newNode(sizeof(_type_), T_##_type_, _variable_) #define newNode(size, tag, variable)\ do { \ Node * newNodeMacroHolder;\ AssertMacro((size) >= sizeof(Node));/* need the tag, at least */ \ newNodeMacroHolder = (Node *) palloc0fast(size); \ newNodeMacroHolder->type = (tag); \ _variable_ = newNodeMacroHolder; \ } while(0) b. Create a function newNode() which is declared as inline, which basically gives you the same code as under (a). -- Sincerely, Stephen R. van den Berg. "Good moaning!" -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] Is it really such a good thing for newNode() to be a macro?
I happened to be looking at nodes.h and started wondering just how sane this coding really is: extern PGDLLIMPORT Node *newNodeMacroHolder; #define newNode(size, tag) \ ( \ AssertMacro((size) >= sizeof(Node)),/* need the tag, at least */ \ newNodeMacroHolder = (Node *) palloc0fast(size), \ newNodeMacroHolder->type = (tag), \ newNodeMacroHolder \ ) Given that we're calling palloc, it's not clear that saving one level of function call is really buying much; and what it's costing us is a store to a global variable that the compiler has no way to optimize away. On a lot of platforms, accessing global variables isn't especially cheap. Also, considering that palloc0fast is a nontrivial macro, and that there are a LOT of uses of newNode(), we're paying rather a lot of code space for a pretty dubious savings. So I'm tempted to get rid of this and just make newNode() an out-of-line function. Thoughts? regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers