Re: [HACKERS] nodeAgg perf tweak
Neil Conway [EMAIL PROTECTED] writes: ISTM it would be reasonable to mandate that aggregate authors return one of three things from their state transition functions: (a) return the previous state value (b) return the next data item value (c) return some other value; if by a pass-by-reference type, allocated in CurrentMemoryContext In the case of (b), we need to copy it in advance_transition_function() to ensure it survives to the next invocation of the state transition function, but that should be doable. For both (a) and (c) we can assume the value has been allocated in the per-1000-rows-transition-function temporary context, and thus we need only copy it when we reset that context. Well, (1) I'm uncomfortable with mandating that aggregate functions do anything special, and (2) I think you lose much of the performance benefit as soon as you have to distinguish cases (b) and (c). Ideally you would use MemoryContextContains for this, but that doesn't work reliably on pointers that point to fields of a tuple. I like the approach shown in my prototype patch better, because it doesn't create any backwards-compatibility issues for existing aggregate functions; instead individual aggregates may be rewritten to avoid palloc's. (In fact, you can get to a zero-pallocs-per-cycle condition, rather than reducing from two to one which is the most that the approach you suggest could save.) I can reproduce the performance improvement with the patch you sent (I can repost numbers if you like, but your patch and mine resulted in about the same performance when I quickly tested it). Hmph. I'll have to repeat my test and figure out why I failed to see much benefit. I had sort of abandoned it in disgust after not getting results, but obviously I shoulda dug deeper. regards, tom lane ---(end of broadcast)--- TIP 2: you can get off all lists at once with the unregister command (send unregister YourEmailAddressHere to [EMAIL PROTECTED])
Re: Please release (was Re: [HACKERS] nodeAgg perf tweak)
Tom Lane wrote: Andrew Dunstan [EMAIL PROTECTED] writes: I've raised this before, but would like to suggest again that there might be virtue in branching earlier in the dev cycle - maybe around the time of the first beta. Given the amount of patching that's gone on, branching 8.1 earlier would have meant a great deal more work for the committers, in the form of double-patching stuff. And I don't know about the other committers, but I've certainly had no spare bandwidth for reviewing or applying 8.1-cycle patches anyway. So I think we made the right move to delay the branch. Core hasn't really talked about the point yet, but I suspect we'll make the branch after 8.0RC1 appears (which hopefully will be Real Soon). I do understand. Really. It just would be nice not to have development work queued for months every release cycle. But limited resources as always ... cheers andrew ---(end of broadcast)--- TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]
Re: Re: Please release (was Re: [HACKERS] nodeAgg perf tweak)
Neil Conway [EMAIL PROTECTED] wrote on 02.12.2004, 05:05:12: On Wed, 2004-12-01 at 19:34 +0100, [EMAIL PROTECTED] wrote: I regard performance testing as an essential part of the release process of any performance critical software. Most earlier beta releases were fixing functional problems and now the focus has moved to performance related issues. I don't agree. There is an important difference between fixing a performance regression and making a performance improvement. If a change made in 8.0 resulted in a significant, unintentional performance regression, I think there would be grounds for fixing it (assuming the fix was fairly noninvasive). The aggregate patch addresses a problem that has existed for a long time (7.4 at the least); I don't see the need to delay 8.0 to fix it. I understand this argument, but believe it misses out one consideration. IMHO the effects of this are somewhat problematic for performance-related issues: ISTM Performance -of any product- is mainly tested during late beta. This is the first real chance to see whether the software performs, since it was unstable or incomplete before that point. Unplanned performance opportunities arise at this time in the cycle. It seems strange to me to only fix regressions that emerge, but simply ignore positive opportunities that arise. For me, the subject under discussion here is what to do with these positive opportunities; I agree with you on handling regressions. If we choose not to take opportunities, then many performance concerns are effectively only ever fixed one release behind and the group is therefore making an unconscious decision that performance is not a top-tier function of the product. My own viewpoint comes from effectively being a leading-edge user (by proxy) - I don't claim any right to an opinion on this issue for any other reason. I do have respect for robustness concerns (why else would I do PITR?), and wouldn't argue this route if the code changes we're discussing weren't well isolated and IMHO fairly non-invasive. These decisions are always a risk-benefit trade-off. If I seem to speak too loudly, please forgive me. I raise this because I honestly believe the +20% perf tweak to be of benefit to the majority, and worth a comparatively short wait to implement and test. We haven't mentioned this, but it is easy to produce custom versions for people with these tweaks in, but that approach detracts from general adoption of the software. As a result, it seems likely that *not* implementing the patch could result in greater personal benefit to me, though that's not a stance I ever take. Could we implement this using a GUC, turned off by default, memory_protect_functions = true, which does not appear in the postgresql.conf? Any potential instability would then be protected by default and partially discouraged from use. That would allow people who understood the risk to disable the robustness in favour of performance - which would be relatively safe when using the built-in aggregate functions. I'll commit to doing some hard testing on this, if we agree to implement it. Thanks, Best Regards, Simon Riggs ---(end of broadcast)--- TIP 6: Have you searched our list archives? http://archives.postgresql.org
Re: Please release (was Re: [HACKERS] nodeAgg perf tweak)
On Wed, Dec 01, 2004 at 10:04:43PM -0500, Tom Lane wrote: Andrew Dunstan [EMAIL PROTECTED] writes: I've raised this before, but would like to suggest again that there might be virtue in branching earlier in the dev cycle - maybe around the time of the first beta. Given the amount of patching that's gone on, branching 8.1 earlier would have meant a great deal more work for the committers, in the form of double-patching stuff. That's because of CVS. Other systems deal with this issues in a more committer-friendly manner. You just have to look at how using BitKeeper has dramatically increased the development going on at the Linux kernel. Whatever your opinion is on Linux itself, that change alone is something to consider. Also, committers are not the only users of our SCM tool. Consider the autovacuum-backend-integration patch. Nobody reviewed it because it's a lot of hassle to take the patch, apply it locally, test it, and continue development. Because if later Matthew comes up with an updated patch, the reviewer has trouble merging both things, and re-reviewing the new patch (did he change this in the new patch, or is it the same as before?). Branching the whole thing could mean that I can integrate his own change history in my private tree, so I can see what went on since the last time; and he can see what changes I did and merge it; etc. The final step, which would be your own review and integration into the official tree, has a better patch to begin with. Which is the way Linux works. I think I will start using Subversion to track Postgres, just to see how it goes. -- Alvaro Herrera ([EMAIL PROTECTED]) If it wasn't for my companion, I believe I'd be having the time of my life (John Dunbar) ---(end of broadcast)--- TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]
Re: [HACKERS] nodeAgg perf tweak
On Thu, 2004-12-02 at 10:45 -0500, Tom Lane wrote: (2) I think you lose much of the performance benefit as soon as you have to distinguish cases (b) and (c). Ideally you would use MemoryContextContains for this, but that doesn't work reliably on pointers that point to fields of a tuple. Why wouldn't a simple comparison work? We're passing two arguments into the aggregate function: (a) corresponds to returning the first argument, and (b) corresponds to returning the second argument. If the aggregate wants to do something more than just return one of the arguments, they can copy/alloc in the current context. I like the approach shown in my prototype patch better, because it doesn't create any backwards-compatibility issues for existing aggregate functions; instead individual aggregates may be rewritten to avoid palloc's. Yeah, I like your approach as well (sorry, I had thought Simon's earlier suggestion along these lines would have required adding knowledge about builtin aggregates to advance_transition_function() itself; adding knowledge to the aggregate implementation is a lot nicer). -Neil ---(end of broadcast)--- TIP 2: you can get off all lists at once with the unregister command (send unregister YourEmailAddressHere to [EMAIL PROTECTED])
Re: [HACKERS] nodeAgg perf tweak
Neil Conway [EMAIL PROTECTED] writes: On Thu, 2004-12-02 at 10:45 -0500, Tom Lane wrote: (2) I think you lose much of the performance benefit as soon as you have to distinguish cases (b) and (c). Why wouldn't a simple comparison work? We're passing two arguments into the aggregate function: (a) corresponds to returning the first argument, and (b) corresponds to returning the second argument. True, but you still have to palloc if it returns the second argument. Yeah, I like your approach as well (sorry, I had thought Simon's earlier suggestion along these lines would have required adding knowledge about builtin aggregates to advance_transition_function() itself; adding knowledge to the aggregate implementation is a lot nicer). It needs documentation --- what I sent you was just a crude hack for testing, not something I would've committed as-is. IIRC, the spec I had in mind was if fcinfo-context is an AggState node then the function may assume it's being called as an aggregate's transition or final function. In this case, the first argument is certainly either an initial value or the output of a prior transition function call. The function may assume that it's OK to scribble on the first argument instead of allocating a fresh object for its result. One could also imagine that the transition and final functions could make a private agreement about the contents of the state value, such that it needn't be a strictly valid Datum --- for instance it might contain pointers to additional transient storage someplace. I think that at the time we were discussing such hacks in connection with user-defined aggregates that needed a large amount of state. regards, tom lane ---(end of broadcast)--- TIP 5: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faqs/FAQ.html
Re: [HACKERS] nodeAgg perf tweak
On Thu, 2004-12-02 at 19:07 -0500, Tom Lane wrote: True, but you still have to palloc if it returns the second argument. Is that common? In any case, I don't see how you can _ever_ avoid a palloc if the aggregate returns the second argument. The second argument is in a per-tuple memory context: there's nothing the aggregate, or nodeAgg, can do about it. I think the tradeoffs between our patches are: - mine would apply to all aggregates, without the need to modify any of them individually - yours would mean that int8inc() and similar aggregates wouldn't ever need to do palloc(); mine would require a palloc() every k calls to the transition function. I don't really see this as a problem: in practice k will be sufficiently large that the palloc overhead should be negligible. -Neil ---(end of broadcast)--- TIP 3: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [HACKERS] nodeAgg perf tweak
Neil Conway [EMAIL PROTECTED] writes: - yours would mean that int8inc() and similar aggregates wouldn't ever need to do palloc(); mine would require a palloc() every k calls to the transition function. No. The current code involves two pallocs per cycle, one inside the aggregate function to construct its result value, and then one in datumCopy to copy the result into the proper context. Your patch reduces that to 1 + 1/k pallocs per cycle, mine to zero. The fact that it's a central fix for all aggregate functions is definitely a nice feature of your approach, but I am concerned about the possible side-effects on user-defined aggregate functions that may not work as you expect them to. I think it's safer to keep the aggregate code behaving as-is and get the performance win in the individual functions. There are not that many aggregates that we really care that much about. regards, tom lane ---(end of broadcast)--- TIP 5: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faqs/FAQ.html
Re: [HACKERS] nodeAgg perf tweak
On Thu, 2004-12-02 at 20:51 -0500, Tom Lane wrote: No. The current code involves two pallocs per cycle, one inside the aggregate function to construct its result value, and then one in datumCopy to copy the result into the proper context. Ah, true -- missed the fact that PG_RETURN_INT64() does a palloc(). (We really ought to fix that on 64-bit machines...) The fact that it's a central fix for all aggregate functions is definitely a nice feature of your approach, but I am concerned about the possible side-effects on user-defined aggregate functions that may not work as you expect them to. I think it's safer to keep the aggregate code behaving as-is and get the performance win in the individual functions. There are not that many aggregates that we really care that much about. Okay, fair enough :) BTW, the spec you posted in your previous message makes sense to me. -Neil ---(end of broadcast)--- TIP 3: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [HACKERS] nodeAgg perf tweak
On Wed, 2004-12-01 at 02:48, Neil Conway wrote: I noticed that we have a bottleneck in aggregate performance in advance_transition_function(): according to callgrind, doing datumCopy() and pfree() for every tuple produced by the transition function is pretty expensive. Some queries bare this out: ... I can reproduce this performance difference consistently. I thought this might have been attributable to memory checking overhead because assertions were enabled, but that doesn't appear to be the case (the above results are without --enable-cassert). That looks like a useful performance gain, and count(*) is a weak point on the performance roadmap. Thanks. The attached patch invokes the transition function in the current memory context. I don't think that's right: a memory leak in an aggregate's transition function would be problematic when we're invoked from a per-query memory context, as is the case with advance_aggregates(). Perhaps we need an additional short-lived memory context in AggStatePerAggData: we could invoke the transition function in that context, and reset it once per, say, 1000 tuples. I'd be a little twitchy about new memory contexts at this stage of beta, but if the code is fairly well isolated that could be good. Alternatively we could just mandate that aggregate transition function's not leak memory and then invoke the transition function in, say, the aggregate's memory context, but that seems a little fragile. Would it possible to differentiate between well-known builtin functions and added transition functions? I realise nothing is that simple, but it seems we could trust some functions more than others. The trust level could be determined at planning time. [DB2 uses FENCED or UNFENCED declarations for this purpose, though the exact meaning is different to your proposal - I'm not suggesting adding external syntax though] A performance gain on the built-ins alone would satisfy 80% of users. -- Best Regards, Simon Riggs ---(end of broadcast)--- TIP 9: the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [HACKERS] nodeAgg perf tweak
On Wed, 2004-12-01 at 08:25 +, Simon Riggs wrote: I'd be a little twitchy about new memory contexts at this stage of beta, but if the code is fairly well isolated that could be good. This would be for 8.1 anyway. Would it possible to differentiate between well-known builtin functions and added transition functions? IMHO, this would be ugly and add unnecessary complexity. I'd like to find a solution that actually fixes the problem, rather than just working around it in the common case. -Neil ---(end of broadcast)--- TIP 6: Have you searched our list archives? http://archives.postgresql.org
Re: [HACKERS] nodeAgg perf tweak
On Wed, 2004-12-01 at 08:37, Neil Conway wrote: On Wed, 2004-12-01 at 08:25 +, Simon Riggs wrote: I'd be a little twitchy about new memory contexts at this stage of beta, but if the code is fairly well isolated that could be good. This would be for 8.1 anyway. Would it possible to differentiate between well-known builtin functions and added transition functions? IMHO, this would be ugly and add unnecessary complexity. I'd like to find a solution that actually fixes the problem, rather than just working around it in the common case. It would be my suggestion to implement the optimisation for the common case *now*, then fix the general case later. Please shave 20% off everybody's aggregation queries, ugly or not. You'll see a lot of happy people. -- Best Regards, Simon Riggs ---(end of broadcast)--- TIP 9: the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Please release (was Re: [HACKERS] nodeAgg perf tweak)
On Wed, Dec 01, 2004 at 10:03:40AM +, Simon Riggs wrote: Please shave 20% off everybody's aggregation queries, ugly or not. You'll see a lot of happy people. When would the experimentation end, and 8.0 be finally released? There's real development waiting only for release to happen ... I have submitted three patches already that are pending for 8.1, and the code keeps drifting. There has to be a release soon. We can't keep in beta forever. Also, I think we should consider using a time-based release process instead of the unpredictable mess that it is used now. If hard development was done in branches and only merged when stable, we could do this. (This is also something that a better SCM could help us do, though GCC is living proof that it can be done with CVS too.) -- Alvaro Herrera ([EMAIL PROTECTED]) God is real, unless declared as int ---(end of broadcast)--- TIP 6: Have you searched our list archives? http://archives.postgresql.org
Re: Please release (was Re: [HACKERS] nodeAgg perf tweak)
Alvaro Herrera [EMAIL PROTECTED] wrote on 01.12.2004, 15:08:11: On Wed, Dec 01, 2004 at 10:03:40AM +, Simon Riggs wrote: Please shave 20% off everybody's aggregation queries, ugly or not. You'll see a lot of happy people. When would the experimentation end, and 8.0 be finally released? There's real development waiting only for release to happen ... Yes, I agree, there is real development waiting to happen. Users of PostgreSQL are waiting to use the new software. Which is why I'd like to see a 20% gain in aggregation performance happen. That's a huge gain, not just the little 1 percents we keep clawing at. For most of the people I work with, database performance is very important. I regard performance testing as an essential part of the release process of any performance critical software. Most earlier beta releases were fixing functional problems and now the focus has moved to performance related issues. Many people have trial-ported their software to r8.0 now that it works sufficiently well to do this and are now reporting performance issues. Also, many pure performance tests are being run on the various betas. I can only speak for myself, but I've spent a good deal of the last month trying to analyse performance issues, report or document them - though others have IMHO done much more than myself towards that goal. For me, performance is a function that requires testing too, and bug fixes for it should be allowed into the tail of the release. Where do you stop? When the further gains from doing so are small+20% tells me that point hasn't yet been reached. Moving between releases is a big pain for people, so many would tend to upgrade every other release. It will be some time before 8.0 makes it into other distributions. When it does, it would be best if it is as good as it can be. 8.1 is only a year away means it is light-years away. We've just gained what looks like 20%+ performance gain on the DBT-2 benchmark; with this perf tweak, we could see 20%+ performance gain on the DBT-3 benchmark also. I'd say if you asked most people whether we should wait another month, but if they did, they'd get a 20% boost on some of their worst queries, then they would choose to wait. 2% = don't bother; 20% = wait. I have submitted three patches already that are pending for 8.1, and the code keeps drifting. There has to be a release soon. We can't keep in beta forever. Code drift is a real pain and you and I were both tortured by this during June and July but I think we should stay in beta until we're done. Also, I think we should consider using a time-based release process instead of the unpredictable mess that it is used now. If hard development was done in branches and only merged when stable, we could do this. (This is also something that a better SCM could help us do, though GCC is living proof that it can be done with CVS too.) PostgreSQL development seems very fast to me, which is good. Timeboxing it would do little to improve the situation for prospective users of the software, whether or not it helps developers. I've no comment on the SCM design - I'll follow whatever is in place. Best Regards Simon Riggs ---(end of broadcast)--- TIP 9: the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: Please release (was Re: [HACKERS] nodeAgg perf tweak)
On Wed, 2004-12-01 at 11:08 -0300, Alvaro Herrera wrote: When would the experimentation end, and 8.0 be finally released? As I said, this is work for 8.1; I don't think it ought to affect the release schedule of 8.0 (perhaps in some marginal way because Tom might spend some time discussing issues with me, but I'll leave it up to Tom to decide how best to spend his time). There's real development waiting only for release to happen ... There is real development being done as we speak, that will be landed to 8.1 when it branches :) I think you've raised two distinct issues: (1) Release scheduling: time-based releases vs. feature-based releases, how long the beta period is, and so on. (2) Development methodology: how do we make it easy for people to engage in new development while the mainline is being stabilized for a release? If hard development was done in branches and only merged when stable, we could do this. I think this might be worth considering as an improvement to #2. What I've been experimenting with locally is doing development in a private VC system, and merging in mainline CVS changes every week or so. That allows me to keep long-running projects in their own private branchs and to take advantage of the VC system's merge support, without needing to change the work environment used by other developers. -Neil ---(end of broadcast)--- TIP 4: Don't 'kill -9' the postmaster
Re: Please release (was Re: [HACKERS] nodeAgg perf tweak)
Neil Conway said: On Wed, 2004-12-01 at 11:08 -0300, Alvaro Herrera wrote: There's real development waiting only for release to happen ... There is real development being done as we speak, that will be landed to 8.1 when it branches :) I've raised this before, but would like to suggest again that there might be virtue in branching earlier in the dev cycle - maybe around the time of the first beta. cheers andrew ---(end of broadcast)--- TIP 5: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faqs/FAQ.html
Re: [HACKERS] nodeAgg perf tweak
Neil Conway [EMAIL PROTECTED] writes: The attached patch invokes the transition function in the current memory context. I don't think that's right: a memory leak in an aggregate's transition function would be problematic when we're invoked from a per-query memory context, as is the case with advance_aggregates(). Agreed. You can't really assume that the transition function is leak-free, particularly not when it's returning a pass-by-ref datatype. Perhaps we need an additional short-lived memory context in AggStatePerAggData: we could invoke the transition function in that context, and reset it once per, say, 1000 tuples. This seems like it might work. Instead of copying the result into the aggcontext on every cycle, we could copy it only when we intend to reset the working context. However there is still a risk: the function might return a pointer into the source tuple, not something that's in the working context at all. (See text_smaller() as one example.) This is problematic since the source tuple will go away on the next cycle. The existing copying code takes care of this case as well as the one where the result is in per_tuple_context. I'm a bit surprised that you measured a significant performance speedup from removing the copying step. Awhile ago I experimented with hacking the count() aggregate function internally to avoid pallocs, and was disappointed to measure no noticeable speedup at all. Digging in the list archives, it looks like I neglected to report that failure, but I do still have the patch and I attach it here. This was from late Dec '03 so it'd be against just-past-7.4 sources. I'd be interested to hear how this compares to what you did under your test conditions; maybe I was using a bogus test case. regards, tom lane *** src/backend/executor/nodeAgg.c.orig Sat Nov 29 14:51:48 2003 --- src/backend/executor/nodeAgg.c Sun Dec 21 16:02:25 2003 *** *** 350,356 */ /* MemSet(fcinfo, 0, sizeof(fcinfo)); */ ! fcinfo.context = NULL; fcinfo.resultinfo = NULL; fcinfo.isnull = false; --- 350,356 */ /* MemSet(fcinfo, 0, sizeof(fcinfo)); */ ! fcinfo.context = (void *) aggstate; fcinfo.resultinfo = NULL; fcinfo.isnull = false; *** *** 528,533 --- 528,534 FunctionCallInfoData fcinfo; MemSet(fcinfo, 0, sizeof(fcinfo)); + fcinfo.context = (void *) aggstate; fcinfo.flinfo = peraggstate-finalfn; fcinfo.nargs = 1; fcinfo.arg[0] = pergroupstate-transValue; *** /home/postgres/pgsql/src/backend/utils/adt/int8.c.orig Mon Dec 1 17:29:19 2003 --- /home/postgres/pgsql/src/backend/utils/adt/int8.c Sun Dec 21 16:03:43 2003 *** *** 17,22 --- 17,23 #include math.h #include libpq/pqformat.h + #include nodes/nodes.h #include utils/int8.h *** *** 565,573 Datum int8inc(PG_FUNCTION_ARGS) { ! int64 arg = PG_GETARG_INT64(0); ! PG_RETURN_INT64(arg + 1); } Datum --- 566,594 Datum int8inc(PG_FUNCTION_ARGS) { ! if (fcinfo-context != NULL IsA(fcinfo-context, AggState)) ! { ! /* !* Special case to avoid palloc overhead for COUNT(). Note: this !* assumes int8 is a pass-by-ref type; if we ever support pass-by-val !* int8, this should be ifdef'd out when int8 is pass-by-val. !* !* When called from nodeAgg, we know that the argument is modifiable !* local storage, so just update it in-place. !*/ ! int64 *arg = (int64 *) PG_GETARG_POINTER(0); ! ! (*arg)++; ! PG_RETURN_POINTER(arg); ! } ! else ! { ! /* Not called by nodeAgg, so just do it the dumb way */ ! int64 arg = PG_GETARG_INT64(0); ! ! PG_RETURN_INT64(arg + 1); ! } } Datum ---(end of broadcast)--- TIP 7: don't forget to increase your free space map settings
Re: Please release (was Re: [HACKERS] nodeAgg perf tweak)
Andrew Dunstan [EMAIL PROTECTED] writes: I've raised this before, but would like to suggest again that there might be virtue in branching earlier in the dev cycle - maybe around the time of the first beta. Given the amount of patching that's gone on, branching 8.1 earlier would have meant a great deal more work for the committers, in the form of double-patching stuff. And I don't know about the other committers, but I've certainly had no spare bandwidth for reviewing or applying 8.1-cycle patches anyway. So I think we made the right move to delay the branch. Core hasn't really talked about the point yet, but I suspect we'll make the branch after 8.0RC1 appears (which hopefully will be Real Soon). regards, tom lane ---(end of broadcast)--- TIP 3: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: Please release (was Re: [HACKERS] nodeAgg perf tweak)
On Wed, 2004-12-01 at 19:34 +0100, [EMAIL PROTECTED] wrote: I regard performance testing as an essential part of the release process of any performance critical software. Most earlier beta releases were fixing functional problems and now the focus has moved to performance related issues. I don't agree. There is an important difference between fixing a performance regression and making a performance improvement. If a change made in 8.0 resulted in a significant, unintentional performance regression, I think there would be grounds for fixing it (assuming the fix was fairly noninvasive). The aggregate patch addresses a problem that has existed for a long time (7.4 at the least); I don't see the need to delay 8.0 to fix it. -Neil ---(end of broadcast)--- TIP 8: explain analyze is your friend
Re: [HACKERS] nodeAgg perf tweak
On Wed, 2004-12-01 at 15:54 -0500, Tom Lane wrote: This seems like it might work. Instead of copying the result into the aggcontext on every cycle, we could copy it only when we intend to reset the working context. Right. This is problematic since the source tuple will go away on the next cycle. The existing copying code takes care of this case as well as the one where the result is in per_tuple_context. ISTM it would be reasonable to mandate that aggregate authors return one of three things from their state transition functions: (a) return the previous state value (b) return the next data item value (c) return some other value; if by a pass-by-reference type, allocated in CurrentMemoryContext In the case of (b), we need to copy it in advance_transition_function() to ensure it survives to the next invocation of the state transition function, but that should be doable. For both (a) and (c) we can assume the value has been allocated in the per-1000-rows-transition-function temporary context, and thus we need only copy it when we reset that context. Digging in the list archives, it looks like I neglected to report that failure, but I do still have the patch and I attach it here. This was from late Dec '03 so it'd be against just-past-7.4 sources. I'd be interested to hear how this compares to what you did under your test conditions; maybe I was using a bogus test case. I can reproduce the performance improvement with the patch you sent (I can repost numbers if you like, but your patch and mine resulted in about the same performance when I quickly tested it). You can find the test data at: http://www.langille.org/watch_list_elements.sql.tgz Just load the .sql file, run ANALYZE, and try the queries I posted. -Neil ---(end of broadcast)--- TIP 7: don't forget to increase your free space map settings
[HACKERS] nodeAgg perf tweak
I noticed that we have a bottleneck in aggregate performance in advance_transition_function(): according to callgrind, doing datumCopy() and pfree() for every tuple produced by the transition function is pretty expensive. Some queries bare this out: dvl=# SELECT W.element_id, count(W.element_ID) FROM watch_list_element W GROUP by W.element_id ORDER by count(W.element_ID) LIMIT 5; element_id | count +--- 65525 | 1 163816 | 1 16341 | 1 131023 | 1 65469 | 1 (5 rows) Time: 176.723 ms dvl=# select count(*) from watch_list_element; count 138044 (1 row) Time: 94.246 ms I've attached a quick and dirty hack that avoids the need to palloc() and pfree() for every tuple produced by the aggregate's transition function. This results in: dvl=# SELECT W.element_id, count(W.element_ID) FROM watch_list_element W GROUP by W.element_id ORDER by count(W.element_ID) LIMIT 5; element_id | count +--- 65525 | 1 163816 | 1 16341 | 1 131023 | 1 65469 | 1 (5 rows) Time: 154.378 ms dvl=# select count(*) from watch_list_element; count 138044 (1 row) Time: 73.975 ms I can reproduce this performance difference consistently. I thought this might have been attributable to memory checking overhead because assertions were enabled, but that doesn't appear to be the case (the above results are without --enable-cassert). The attached patch invokes the transition function in the current memory context. I don't think that's right: a memory leak in an aggregate's transition function would be problematic when we're invoked from a per-query memory context, as is the case with advance_aggregates(). Perhaps we need an additional short-lived memory context in AggStatePerAggData: we could invoke the transition function in that context, and reset it once per, say, 1000 tuples. Alternatively we could just mandate that aggregate transition function's not leak memory and then invoke the transition function in, say, the aggregate's memory context, but that seems a little fragile. Comments? -Neil # # patch src/backend/executor/nodeAgg.c # from [851fd2d59a89ee2e3c23a1ca8fdbf4d466f98d26] #to [532ce100a0de1288fb23acc8de3a6bcac0982ec3] # --- src/backend/executor/nodeAgg.c +++ src/backend/executor/nodeAgg.c @@ -350,10 +350,10 @@ return; } } - +#if 0 /* We run the transition functions in per-input-tuple memory context */ oldContext = MemoryContextSwitchTo(aggstate-tmpcontext-ecxt_per_tuple_memory); - +#endif /* * OK to call the transition function * @@ -375,6 +375,7 @@ newVal = FunctionCallInvoke(fcinfo); +#if 0 /* * If pass-by-ref datatype, must copy the new value into aggcontext * and pfree the prior transValue. But if transfn returned a pointer @@ -393,11 +394,13 @@ if (!pergroupstate-transValueIsNull) pfree(DatumGetPointer(pergroupstate-transValue)); } +#endif pergroupstate-transValue = newVal; pergroupstate-transValueIsNull = fcinfo.isnull; - +#if 0 MemoryContextSwitchTo(oldContext); +#endif } /* ---(end of broadcast)--- TIP 4: Don't 'kill -9' the postmaster
Re: [HACKERS] nodeAgg perf tweak
Neil Conway [EMAIL PROTECTED] writes: I've attached a quick and dirty hack that avoids the need to palloc() and pfree() for every tuple produced by the aggregate's transition function. And how badly does it leak memory? I do not believe this patch is tenable. Something that occurred to me the other morning in the shower is that we could trivially inline MemoryContextSwitchTo() when using gcc, much as you did for list_length(). I haven't gotten around to doing it but it seems like a free percent-or-two improvement. regards, tom lane ---(end of broadcast)--- TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]
Re: [HACKERS] nodeAgg perf tweak
On Tue, 2004-11-30 at 23:15 -0500, Tom Lane wrote: And how badly does it leak memory? I do not believe this patch is tenable. Did you read the rest of my mail? Something that occurred to me the other morning in the shower is that we could trivially inline MemoryContextSwitchTo() when using gcc, much as you did for list_length(). I haven't gotten around to doing it but it seems like a free percent-or-two improvement. Yeah, it actually occurred to me as well this would be worth doing. It's not relevant to this particular example though. -Neil ---(end of broadcast)--- TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]