Re: [HACKERS] ALTER TYPE 3: add facility to identify further no-work cases
On Fri, Jan 28, 2011 at 04:49:39PM -0500, Noah Misch wrote: On Fri, Jan 28, 2011 at 01:52:32PM -0500, Robert Haas wrote: On Wed, Jan 26, 2011 at 6:06 PM, Tom Lane t...@sss.pgh.pa.us wrote: I'm not sure how important that concern is though, because it's hard to see how any such change wouldn't break existing cast implementation functions anyway. ?If the API for length-coercing cast functions changes, breaking their helper functions too hardly seems like an issue. Or are you saying you want to punt this whole proposal till after that happens? ?I had muttered something of the sort way upthread, but I didn't think anyone else thought that way. I've been thinking about this patch a little bit more and I'm coming around to the viewpoint that we should mark this (and the successor patches in the same series) Returned with Feedback, and revisit the issue for 9.2. This is just. One other thing: #7 does not depend on #3,4,5,6 or any design problems raised thus far, so there's no need to treat it the same as that group. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] ALTER TYPE 3: add facility to identify further no-work cases
On Fri, Jan 28, 2011 at 4:49 PM, Noah Misch n...@leadboat.com wrote: I'm not necessarily signing on to the viewpoint that we should wait to do any of this work until after we refactor typemods, but it does strike me that the fact that Tom and I have completely different ideas of how this will interact with future changes in this area probably means we need to take some more time to talk about what those future enhancements might look like, rather than rushing something now and maybe regretting it later. We may not need to actually do all that work before we do this, but it sounds like at a minimum we need some agreement on what the design should look like. I've deferred comment due to my obvious bias, but I can't see any sense in blocking a change like this one on account of the exceptionally-preliminary discussions about enriching typmod. Suppose, like my original design, we make no provision to insulate against future typmod-related changes. The number of interfaces that deal in typmod are so great that the marginal effort to update the new ones will be irrelevant. I still like Tom's idea of an Expr-Expr interface. I like it because it opens more opportunities now, not because it will eliminate some modicum of effort from an enriched-typmod implementation. Once we add syntax to support this feature, we have to support that syntax for an extremely long time. We can't just remove it in the next release and replace it with something else - pg_dump has to still work, for example, and we have to able to reload whatever it produces. Adding a user-visible API that we may want to turn around and change *next release* is just a bad idea. If we were talking about implementing this through some sort of hard-coded internal list of types, it wouldn't be quite so much of an issue, but that's not where we're at. Moreover, I fear that injecting this into eval_const_expressions() is adding a frammish that has no practical utility apart from this case, but we still have to pay the overhead; even Tom expressed some initial doubt about whether that made sense, and I'm certainly not sold on it either. I don't see any particular reason why we can't resolve all of these issues, but it's going to take more time than we have right now. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] ALTER TYPE 3: add facility to identify further no-work cases
On Sat, Jan 29, 2011 at 3:26 AM, Noah Misch n...@leadboat.com wrote: One other thing: #7 does not depend on #3,4,5,6 or any design problems raised thus far, so there's no need to treat it the same as that group. Well, if you want to post an updated patch that's independent of the rest of the series, I guess you can... but I think the chances of that getting applied for this release are pretty much zero. That patch relies on some subtle changes to the contract implied by an operator family and what looks at first blush like a lot of grotty hackery. I can't get very excited about spending a lot of time on that right now, especially given the amount of difficulty we've had reaching a meeting of the minds on cases that don't have those problems. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] ALTER TYPE 3: add facility to identify further no-work cases
On Wed, Jan 26, 2011 at 6:06 PM, Tom Lane t...@sss.pgh.pa.us wrote: I'm not sure how important that concern is though, because it's hard to see how any such change wouldn't break existing cast implementation functions anyway. If the API for length-coercing cast functions changes, breaking their helper functions too hardly seems like an issue. Or are you saying you want to punt this whole proposal till after that happens? I had muttered something of the sort way upthread, but I didn't think anyone else thought that way. I've been thinking about this patch a little bit more and I'm coming around to the viewpoint that we should mark this (and the successor patches in the same series) Returned with Feedback, and revisit the issue for 9.2. I'm not necessarily signing on to the viewpoint that we should wait to do any of this work until after we refactor typemods, but it does strike me that the fact that Tom and I have completely different ideas of how this will interact with future changes in this area probably means we need to take some more time to talk about what those future enhancements might look like, rather than rushing something now and maybe regretting it later. We may not need to actually do all that work before we do this, but it sounds like at a minimum we need some agreement on what the design should look like. I think there is still hope for ALTER TYPE 2 (really ALTER TABLE 2 or SET DATA TYPE 2; it's misnamed) to be committed this cycle and I'll continue reviewing that one to see whether we can get at least that much done for 9.1. But I think the rest of this just needs more time than we have to give it right now. There are more than 60 patches left open in the CommitFest at this point, and we have less than three weeks left. If we don't focus our efforts on the things where we have clear agreement and good, finished code, we're going to end up either dragging this CommitFest out for months or else getting very little done for the next few weeks and then indiscriminately punting whatever is left over at the end. I don't want to do either of those things, so that means it's time to start making some tough choices, and unfortunately I think this is one that's going to have to get cut, for lack of agreement on the basic design. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] ALTER TYPE 3: add facility to identify further no-work cases
On Fri, Jan 28, 2011 at 01:52:32PM -0500, Robert Haas wrote: On Wed, Jan 26, 2011 at 6:06 PM, Tom Lane t...@sss.pgh.pa.us wrote: I'm not sure how important that concern is though, because it's hard to see how any such change wouldn't break existing cast implementation functions anyway. ?If the API for length-coercing cast functions changes, breaking their helper functions too hardly seems like an issue. Or are you saying you want to punt this whole proposal till after that happens? ?I had muttered something of the sort way upthread, but I didn't think anyone else thought that way. I've been thinking about this patch a little bit more and I'm coming around to the viewpoint that we should mark this (and the successor patches in the same series) Returned with Feedback, and revisit the issue for 9.2. This is just. I'm not necessarily signing on to the viewpoint that we should wait to do any of this work until after we refactor typemods, but it does strike me that the fact that Tom and I have completely different ideas of how this will interact with future changes in this area probably means we need to take some more time to talk about what those future enhancements might look like, rather than rushing something now and maybe regretting it later. We may not need to actually do all that work before we do this, but it sounds like at a minimum we need some agreement on what the design should look like. I've deferred comment due to my obvious bias, but I can't see any sense in blocking a change like this one on account of the exceptionally-preliminary discussions about enriching typmod. Suppose, like my original design, we make no provision to insulate against future typmod-related changes. The number of interfaces that deal in typmod are so great that the marginal effort to update the new ones will be irrelevant. I still like Tom's idea of an Expr-Expr interface. I like it because it opens more opportunities now, not because it will eliminate some modicum of effort from an enriched-typmod implementation. nm -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] ALTER TYPE 3: add facility to identify further no-work cases
On Thu, Jan 27, 2011 at 1:14 AM, Noah Misch n...@leadboat.com wrote: Based on downthread discussion, I figure this will all change a good deal. I'll still briefly explain the patch as written. Most of the patch is plumbing to support the new syntax, catalog entries, and FuncExpr field. The important changes are in parse_coerce.c. I modified find_coercion_pathway() and find_typmod_coercion_function() to retrieve pg_cast.castexemptor alongside pg_cast.castfunc. Their callers (coerce_type() and coerce_type_typmod(), respectively) then call the new apply_exemptor_function(), which calls the exemptor function, if any, returns the value to place in FuncExpr.funcexempt, and possibly updates the CoercionPathType the caller is about to use. build_coercion_expression(), unchanged except to populate FuncExpr.funcexempt, remains responsible for creating the appropriate node (RelabelType, FuncExpr). Finally, I change GetCoerceExemptions to use FuncExpr.funcexempt. OK. I was thinking that instead moving this into eval_const_expressions(), we just make the logic in find_coercion_pathway() call the exemptor function (or whatever we call it) right around here: switch (castForm-castmethod) { case COERCION_METHOD_FUNCTION: result = COERCION_PATH_FUNC; *funcid = castForm-castfunc; break; case COERCION_METHOD_INOUT: result = COERCION_PATH_COERCEVIAIO; break; case COERCION_METHOD_BINARY: result = COERCION_PATH_RELABELTYPE; break; default: elog(ERROR, unrecognized castmethod: %d, (int) castForm-castmethod); break; } If it's COERCION_METHOD_FUNCTION, then instead of unconditionally setting the result to COERCION_PATH_FUNC, we inquire - based on the typemods - whether it's OK to downgrade to a COERCION_PATH_RELABELTYPE. The only fly in the ointment is that find_coercion_pathway() doesn't current get the typemods. Not sure how ugly that would be to fix. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] ALTER TYPE 3: add facility to identify further no-work cases
Robert Haas robertmh...@gmail.com writes: OK. I was thinking that instead moving this into eval_const_expressions(), we just make the logic in find_coercion_pathway() call the exemptor function (or whatever we call it) right around here: No, no, no, no. Didn't you read yesterday's discussion? parse_coerce is entirely the wrong place for this. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] ALTER TYPE 3: add facility to identify further no-work cases
On Thu, Jan 27, 2011 at 9:46 AM, Tom Lane t...@sss.pgh.pa.us wrote: Robert Haas robertmh...@gmail.com writes: OK. I was thinking that instead moving this into eval_const_expressions(), we just make the logic in find_coercion_pathway() call the exemptor function (or whatever we call it) right around here: No, no, no, no. Didn't you read yesterday's discussion? parse_coerce is entirely the wrong place for this. I not only read it, I participated in it. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] ALTER TYPE 3: add facility to identify further no-work cases
On Thu, Jan 27, 2011 at 10:42 AM, Robert Haas robertmh...@gmail.com wrote: On Thu, Jan 27, 2011 at 9:46 AM, Tom Lane t...@sss.pgh.pa.us wrote: Robert Haas robertmh...@gmail.com writes: OK. I was thinking that instead moving this into eval_const_expressions(), we just make the logic in find_coercion_pathway() call the exemptor function (or whatever we call it) right around here: No, no, no, no. Didn't you read yesterday's discussion? parse_coerce is entirely the wrong place for this. I not only read it, I participated in it. To put that another way, there's a difference between reading something and agreeing with it. I did the former, but not the latter. It seems to me that this discussion is unnecessarily antagonistic. Is it not OK for me to have a different idea about which way to go with something than you do? My personal view is that we ought to try to be increasing the number of places where type modifiers behave like they're really part of the type, rather than being an afterthought that we deal with when convenient and otherwise ignore. Otherwise, I see the chances of any substantive improvements in our type system as being just about zero. However, the larger point here is simply this: I'm trying to make some progress on reviewing and, where appropriate, committing the patches that were submitted for this CommitFest. I'd like to try to avoid the phenomenon where we're tripping over each other's feet. We're not making out very well on that front at the moment. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] ALTER TYPE 3: add facility to identify further no-work cases
Robert Haas robertmh...@gmail.com writes: Is it not OK for me to have a different idea about which way to go with something than you do? Well, in this case I firmly believe that you're mistaken. My personal view is that we ought to try to be increasing the number of places where type modifiers behave like they're really part of the type, rather than being an afterthought that we deal with when convenient and otherwise ignore. And this argument is 100% irrelevant to the problem. The problem is that you want to put an optimization into the wrong phase of processing. That is going to hurt us, tomorrow if not today, and it has got *no* redeeming social value in terms of beng more flexible about what typmods are or how well integrated they are. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] ALTER TYPE 3: add facility to identify further no-work cases
On Thu, Jan 27, 2011 at 11:35 AM, Tom Lane t...@sss.pgh.pa.us wrote: My personal view is that we ought to try to be increasing the number of places where type modifiers behave like they're really part of the type, rather than being an afterthought that we deal with when convenient and otherwise ignore. And this argument is 100% irrelevant to the problem. The problem is that you want to put an optimization into the wrong phase of processing. That is going to hurt us, tomorrow if not today, and it has got *no* redeeming social value in terms of beng more flexible about what typmods are or how well integrated they are. The only thing we're deciding here is whether or not a cast requires a function. That's a function of the type OIDs and the typemods. I don't see why it's wrong to do the portion that involves the types in the same place as the portion that involves the typemods. Perhaps you could explain. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] ALTER TYPE 3: add facility to identify further no-work cases
On Thu, Jan 27, 2011 at 09:02:26AM -0500, Robert Haas wrote: OK. I was thinking that instead moving this into eval_const_expressions(), we just make the logic in find_coercion_pathway() call the exemptor function (or whatever we call it) right around here: switch (castForm-castmethod) { case COERCION_METHOD_FUNCTION: result = COERCION_PATH_FUNC; *funcid = castForm-castfunc; break; case COERCION_METHOD_INOUT: result = COERCION_PATH_COERCEVIAIO; break; case COERCION_METHOD_BINARY: result = COERCION_PATH_RELABELTYPE; break; default: elog(ERROR, unrecognized castmethod: %d, (int) castForm-castmethod); break; } If it's COERCION_METHOD_FUNCTION, then instead of unconditionally setting the result to COERCION_PATH_FUNC, we inquire - based on the typemods - whether it's OK to downgrade to a COERCION_PATH_RELABELTYPE. The only fly in the ointment is that find_coercion_pathway() doesn't current get the typemods. Not sure how ugly that would be to fix. Basically, this is a stylistic variation of the approach from my original at3 patch. I believe I looked at that particular positioning and decided that the extra arguments and effects on non-parse_coerce.c callers were undesirable. Debatable as any style issue, though. Note that you need to do the same thing in find_typmod_coercion_function(). -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] ALTER TYPE 3: add facility to identify further no-work cases
On Mon, Jan 24, 2011 at 11:13 PM, Noah Misch n...@leadboat.com wrote: This helps on conversions like varchar(4)-varchar(8) and text-xml. I've read through this patch somewhat. As I believe Tom also commented previously, exemptor is a pretty bad choice of name. I spent awhile with a thesaurus before coining that. Since Tom's comment, I've been hoping the next person to complain would at least suggest a better name. Shooting in the dark on a nomenclature issue isn't fun, and I'm not picky. I can't speak for Tom, but I guess my gripe about that name is that we seem to view a cast as either WITH FUNCTION or WITHOUT FUNCTION. A cast-without-function is trivial from an implementation point of view, but it's still a cast, whereas the word exempt seems to imply that you're skipping the cast altogether. A side issue is that I really want to avoid adding a new parser keyword for this. It doesn't bother me too much to add keywords for really important and user-visible features, but when we're adding stuff that's only going to be used by 0.1% of our users it's really better if we can avoid it, because it slows down the parser. Maybe we could do something like: CREATE CAST (source_type AS target_type) WITH FUNCTION function_name (argument_type, [, ...]) [ ANALYZE USING function_name ] [ AS ASSIGNMENT | AS IMPLICIT ] Presumably, we wouldn't need the ANALYZE USING clause for the WITHOUT FUNCTION case, since the answer is a foregone conclusion in that case. We could possibly support it also for WITH INOUT, but I'm not sure whether the real-world utility is more than zero. Internally, we could refer to a function of this type as a cast analyzer. Don't take the above as Gospel truth, it's just an idea from one person on one day. More generally, I think this patch is a case of coding something complicated without first achieving consensus on what the behavior should be. I think we need to address that problem first, and then decide what code needs to be written, and then write the code. Well, that's why I started the design thread Avoiding rewrite in ALTER TABLE ALTER TYPE before writing any code. That thread petered out rather than reach any consensus. What would you have done then? Let's defer this question until we're more on the same page about the rest of this. It's obvious I'm not completely understanding this patch... I think what this patch is trying to do is tag the call to the cast function with the information that we might not need to call it, but ISTM it would be better to just notice that the function call is unnecessary and insert a RelabelType node instead. This patch does exactly that for varchar(4) - varchar(8) and similar cases. Oh, uh, err... OK, I obviously haven't understood it well enough. I'll look at it some more, but if there's anything you can write up to explain what you changed and why, that would be helpful. I think I'm also having trouble with the fact that these patches don't apply cleanly against the master branch, because they're stacked up on each other. Maybe this will be easier once we get more of the ALTER TYPE 2 stuff in. *reads archives* In fact, Tom already made pretty much exactly this suggestion, on a thread entitled Avoiding rewrite in ALTER TABLE ALTER TYPE. The only possible objection I can see to that line of attack is that it's not clear what the API should be, or whether there might be too much runtime overhead for the amount of benefit that can be obtained. I believe the patch does implement Tom's suggestion, at least in spirit. He mentioned expression_planner() as the place to do it. In principle, We could do it *right there* and avoid any changes to coerce_to_target_type(), but the overhead would increase. Specifically, we would be walking an already-built expression tree looking for casts to remove, probably doing a syscache lookup for every cast to grab the exemptor function OID. Doing it there would prevent us from directly helping or harming plain old queries. Since ExecRelCheck(), FormIndexDatum() and other notable paths call expression_planner(), we wouldn't want to casually add an expensive algorithm there. To avoid the extra syscache lookups, we'd piggyback off those already done in coerce_to_target_type(). That brings us to the current implementation. Better to make the entire decision in coerce_to_target_type() and never add the superfluous node to the expression tree in the first place. coerce_to_target_type() and callees seemed like the natural, low cost place to make the changes. By doing it that way, did I miss some important detail or implication of Tom's suggestion? I'm not sure. Tom? As for performance, coerce_to_target_type() is certainly in wide use, but it's not exactly a hot path. I prepared and ran the attached bench-coerce-worst.sql and bench-coerce-best.sql. Both are quite artificial. The first one basically asks Can we
Re: [HACKERS] ALTER TYPE 3: add facility to identify further no-work cases
Robert Haas robertmh...@gmail.com writes: ... A side issue is that I really want to avoid adding a new parser keyword for this. It doesn't bother me too much to add keywords for really important and user-visible features, but when we're adding stuff that's only going to be used by 0.1% of our users it's really better if we can avoid it, because it slows down the parser. Maybe we could do something like: CREATE CAST (source_type AS target_type) WITH FUNCTION function_name (argument_type, [, ...]) [ ANALYZE USING function_name ] [ AS ASSIGNMENT | AS IMPLICIT ] I'm not terribly thrilled with the suggestion of ANALYZE here, because given the way we use that word elsewhere, people are likely to think it means something to do with statistics collection; or at least that it implies some deep and complicated analysis of the cast. I suggest using a phrase based on the idea that this function tells you whether you can skip the cast, or (if the sense is inverted) whether the cast has to be executed. SKIP IF function_name would be nice but SKIP isn't an extant keyword either. The best variant that I can come up with after a minute's contemplation of kwlist.h is NO WORK IF function_name. If you didn't mind inverting the sense of the result then we could use EXECUTE IF function_name. Internally, we could refer to a function of this type as a cast analyzer. Another possibility is to call it a cast checker and use CHECK USING. But I'm not sure that's much better than ANALYZE in terms of conveying the purpose to a newbie. I believe the patch does implement Tom's suggestion, at least in spirit. He mentioned expression_planner() as the place to do it. In principle, We could do it *right there* and avoid any changes to coerce_to_target_type(), but the overhead would increase. Specifically, we would be walking an already-built expression tree looking for casts to remove, probably doing a syscache lookup for every cast to grab the exemptor function OID. No no no no. The right place to do it is during expression simplification in eval_const_expressions(). We are already disassembling and reassembling the tree, and looking up functions, in that pass. Detecting that a call is a no-op fits into that very naturally. Think of it as an alternative to inline_function(). One point worth making here is that eval_const_expressions() does not currently care very much whether a function call came from cast syntax or explicitly. It might be worth thinking about whether we want to have a generic this-function-call-is-a-no-op simplifier hook available for *all* functions not just those that are casts. I'm not sure we want to pay the overhead of another pg_proc column, but it's something to think about. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] ALTER TYPE 3: add facility to identify further no-work cases
On Wed, Jan 26, 2011 at 12:47 PM, Tom Lane t...@sss.pgh.pa.us wrote: Robert Haas robertmh...@gmail.com writes: ... A side issue is that I really want to avoid adding a new parser keyword for this. It doesn't bother me too much to add keywords for really important and user-visible features, but when we're adding stuff that's only going to be used by 0.1% of our users it's really better if we can avoid it, because it slows down the parser. Maybe we could do something like: CREATE CAST (source_type AS target_type) WITH FUNCTION function_name (argument_type, [, ...]) [ ANALYZE USING function_name ] [ AS ASSIGNMENT | AS IMPLICIT ] I'm not terribly thrilled with the suggestion of ANALYZE here, because given the way we use that word elsewhere, people are likely to think it means something to do with statistics collection; or at least that it implies some deep and complicated analysis of the cast. I suggest using a phrase based on the idea that this function tells you whether you can skip the cast, or (if the sense is inverted) whether the cast has to be executed. SKIP IF function_name would be nice but SKIP isn't an extant keyword either. The best variant that I can come up with after a minute's contemplation of kwlist.h is NO WORK IF function_name. If you didn't mind inverting the sense of the result then we could use EXECUTE IF function_name. What about borrowing from the trigger syntax? WITH FUNCTION function_name (argument_type, [...]) WHEN function_that_returns_true_when_the_call_is_needed One point worth making here is that eval_const_expressions() does not currently care very much whether a function call came from cast syntax or explicitly. It might be worth thinking about whether we want to have a generic this-function-call-is-a-no-op simplifier hook available for *all* functions not just those that are casts. I'm not sure we want to pay the overhead of another pg_proc column, but it's something to think about. It's not obvious to me that it has a use case outside of casts, but it's certainly possible I'm missing something. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] ALTER TYPE 3: add facility to identify further no-work cases
Robert Haas robertmh...@gmail.com writes: On Wed, Jan 26, 2011 at 12:47 PM, Tom Lane t...@sss.pgh.pa.us wrote: If you didn't mind inverting the sense of the result then we could use EXECUTE IF function_name. What about borrowing from the trigger syntax? WITH FUNCTION function_name (argument_type, [...]) WHEN function_that_returns_true_when_the_call_is_needed That's a good thought. Or we could use WHEN NOT check_function if you want to keep to the negative-result semantics. One point worth making here is that eval_const_expressions() does not currently care very much whether a function call came from cast syntax or explicitly. It might be worth thinking about whether we want to have a generic this-function-call-is-a-no-op simplifier hook available for *all* functions not just those that are casts. I'm not sure we want to pay the overhead of another pg_proc column, but it's something to think about. It's not obvious to me that it has a use case outside of casts, but it's certainly possible I'm missing something. A possible example is simplifying X + 0 to X, or X * 0 to 0. I've never wanted to inject such datatype-specific knowledge directly into the planner, but if we viewed it as function-specific knowledge supplied by a per-function helper routine, maybe it would fly. Knowing that a cast function does nothing useful for particular cases could be seen as a special case of this type of simplification. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] ALTER TYPE 3: add facility to identify further no-work cases
On Wed, Jan 26, 2011 at 3:08 PM, Tom Lane t...@sss.pgh.pa.us wrote: Robert Haas robertmh...@gmail.com writes: On Wed, Jan 26, 2011 at 12:47 PM, Tom Lane t...@sss.pgh.pa.us wrote: If you didn't mind inverting the sense of the result then we could use EXECUTE IF function_name. What about borrowing from the trigger syntax? WITH FUNCTION function_name (argument_type, [...]) WHEN function_that_returns_true_when_the_call_is_needed That's a good thought. Or we could use WHEN NOT check_function if you want to keep to the negative-result semantics. Seems a bit contorted; I don't see any particular reason to prefer positive vs negative semantics in this case. One point worth making here is that eval_const_expressions() does not currently care very much whether a function call came from cast syntax or explicitly. It might be worth thinking about whether we want to have a generic this-function-call-is-a-no-op simplifier hook available for *all* functions not just those that are casts. I'm not sure we want to pay the overhead of another pg_proc column, but it's something to think about. It's not obvious to me that it has a use case outside of casts, but it's certainly possible I'm missing something. A possible example is simplifying X + 0 to X, or X * 0 to 0. I've never wanted to inject such datatype-specific knowledge directly into the planner, but if we viewed it as function-specific knowledge supplied by a per-function helper routine, maybe it would fly. Knowing that a cast function does nothing useful for particular cases could be seen as a special case of this type of simplification. Oh, I see. The times I've seen an issue with those kinds of expressions have always been related to selectivity estimation. For example, you'd like to get a selectivity estimate of 1-nullfrac for (x+0)=x and 0 for (x+1)=x, and maybe (1-nullfrac)/2 for (x%2)=0. This would only handle the first of those cases, so I'm inclined to think it's too weak to have much general utility. The casting cases can, I think, have a much larger impact - they occur more often in practice, and if you can (e.g.) avoid an entire table rewrite, that's a pretty big deal. Another semi-related problem case I've run across is that CASE WHEN ... THEN 1 WHEN ... THEN 2 END ought to be known to only ever emit 1 and 2, and the selectivity of comparing that to some other value ought to be computed on that basis. But now I'm really wandering off into the weeds. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] ALTER TYPE 3: add facility to identify further no-work cases
Robert Haas robertmh...@gmail.com writes: On Wed, Jan 26, 2011 at 3:08 PM, Tom Lane t...@sss.pgh.pa.us wrote: Robert Haas robertmh...@gmail.com writes: It's not obvious to me that it has a use case outside of casts, but it's certainly possible I'm missing something. A possible example is simplifying X + 0 to X, or X * 0 to 0. Oh, I see. The times I've seen an issue with those kinds of expressions have always been related to selectivity estimation. Yeah, helping the planner recognize equivalent cases is at least as large a reason for wanting this as any direct savings of execution time. I don't mind confining the feature to casts to start with, but it might be a good idea to specify the check-function API in a way that would let it be plugged into a more generally available call-simplification hook later. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] ALTER TYPE 3: add facility to identify further no-work cases
On Wed, Jan 26, 2011 at 4:02 PM, Tom Lane t...@sss.pgh.pa.us wrote: Robert Haas robertmh...@gmail.com writes: On Wed, Jan 26, 2011 at 3:08 PM, Tom Lane t...@sss.pgh.pa.us wrote: Robert Haas robertmh...@gmail.com writes: It's not obvious to me that it has a use case outside of casts, but it's certainly possible I'm missing something. A possible example is simplifying X + 0 to X, or X * 0 to 0. Oh, I see. The times I've seen an issue with those kinds of expressions have always been related to selectivity estimation. Yeah, helping the planner recognize equivalent cases is at least as large a reason for wanting this as any direct savings of execution time. Agreed. I don't mind confining the feature to casts to start with, but it might be a good idea to specify the check-function API in a way that would let it be plugged into a more generally available call-simplification hook later. Got any suggestions? My thought was that it should just get (type, typemod, type, typemod) and return a boolean. We could do something more complicated, like Expr - Expr, but I'm pretty unconvinced there's any value there. I'd like to see some kind of appropriate hook for interjecting selectivity estimates for cases that we currently can't recognize, but my gut feeling is that that's insufficiently related at the problem at hand to worry about it. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] ALTER TYPE 3: add facility to identify further no-work cases
Robert Haas robertmh...@gmail.com writes: On Wed, Jan 26, 2011 at 4:02 PM, Tom Lane t...@sss.pgh.pa.us wrote: I don't mind confining the feature to casts to start with, but it might be a good idea to specify the check-function API in a way that would let it be plugged into a more generally available call-simplification hook later. Got any suggestions? My thought was that it should just get (type, typemod, type, typemod) and return a boolean. We could do something more complicated, like Expr - Expr, but I'm pretty unconvinced there's any value there. Well, (type, typemod, type, typemod) would be fine as long as the only case you're interested in optimizing is typemod-lengthening coercions. I think we're going to want the more general Expr-to-Expr capability eventually. I guess we could do the restricted case for now, with the idea that there could be a standard Expr-to-Expr interface function created later and installed into the generic hook facility for functions that are cast functions. That could look into pg_cast and make the more restricted call when appropriate. (The meat of this function would be the same code you'd have to add to eval_const_expressions anyway for the hard-wired version...) I'd like to see some kind of appropriate hook for interjecting selectivity estimates for cases that we currently can't recognize, but my gut feeling is that that's insufficiently related at the problem at hand to worry about it. Agreed, that seems a bit off-topic. There are ancient comments in the source code complaining about the lack of selectivity hooks for function (as opposed to operator) calls, but that's not what Noah signed up to fix. In any case I'm not sure that expression simplification is closely connected to selectivity estimation --- to my mind it's an independent process that is good to run first. As an example, the ALTER TABLE case has a use for the former but not the latter. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] ALTER TYPE 3: add facility to identify further no-work cases
On Wed, Jan 26, 2011 at 5:01 PM, Tom Lane t...@sss.pgh.pa.us wrote: Robert Haas robertmh...@gmail.com writes: On Wed, Jan 26, 2011 at 4:02 PM, Tom Lane t...@sss.pgh.pa.us wrote: I don't mind confining the feature to casts to start with, but it might be a good idea to specify the check-function API in a way that would let it be plugged into a more generally available call-simplification hook later. Got any suggestions? My thought was that it should just get (type, typemod, type, typemod) and return a boolean. We could do something more complicated, like Expr - Expr, but I'm pretty unconvinced there's any value there. Well, (type, typemod, type, typemod) would be fine as long as the only case you're interested in optimizing is typemod-lengthening coercions. I think we're going to want the more general Expr-to-Expr capability eventually. I guess we could do the restricted case for now, with the idea that there could be a standard Expr-to-Expr interface function created later and installed into the generic hook facility for functions that are cast functions. That could look into pg_cast and make the more restricted call when appropriate. (The meat of this function would be the same code you'd have to add to eval_const_expressions anyway for the hard-wired version...) Well, if you're positive we're eventually going to want this in pg_proc, we may as well add it now. But I'm not too convinced it's the right general API. The number of people writing exactly x + 0 or x * 0 in a query has got to be vanishingly small; I'm not eager to add additional parse analysis time to every SQL statement that has a function in it just to detect those cases. Even slightly more complicated problems seem intractable - e.g. (x + 1) = x can be simplified to constant false, and NOT ((x + 1) = x) can be simplified to x IS NOT NULL, but under the proposed API those would have to hang off of =(int4,int4), which seems pretty darn ugly. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] ALTER TYPE 3: add facility to identify further no-work cases
Robert Haas robertmh...@gmail.com writes: Well, if you're positive we're eventually going to want this in pg_proc, we may as well add it now. But I'm not too convinced it's the right general API. The number of people writing exactly x + 0 or x * 0 in a query has got to be vanishingly small; I'm not eager to add additional parse analysis time to every SQL statement that has a function in it just to detect those cases. Actually, you've got that backwards: the facility I've got in mind would cost next to nothing when not used. The place where we'd want to insert this in eval_const_expressions has already got its hands on the relevant pg_proc row, so checking for a nonzero hook-function reference would be a matter of a couple of instructions. If we go with a pg_cast entry then we're going to have to add a pg_cast lookup for every cast, whether it turns out to be optimizable or not; which is going to cost quite a lot more. The intermediate hook function I was sketching might be worthwhile from a performance standpoint even if we don't expose the more general feature to users, just because it would be possible to avoid useless pg_cast lookups (by not installing the hook except on pg_proc entries for which there's a relevant CAST WHEN function to call). Even slightly more complicated problems seem intractable - e.g. (x + 1) = x can be simplified to constant false, and NOT ((x + 1) = x) can be simplified to x IS NOT NULL, but under the proposed API those would have to hang off of =(int4,int4), which seems pretty darn ugly. True, but where else are you going to hang them off of? Not that I was particularly thinking of doing either one of those. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] ALTER TYPE 3: add facility to identify further no-work cases
On Wed, Jan 26, 2011 at 5:32 PM, Tom Lane t...@sss.pgh.pa.us wrote: Robert Haas robertmh...@gmail.com writes: Well, if you're positive we're eventually going to want this in pg_proc, we may as well add it now. But I'm not too convinced it's the right general API. The number of people writing exactly x + 0 or x * 0 in a query has got to be vanishingly small; I'm not eager to add additional parse analysis time to every SQL statement that has a function in it just to detect those cases. Actually, you've got that backwards: the facility I've got in mind would cost next to nothing when not used. The place where we'd want to insert this in eval_const_expressions has already got its hands on the relevant pg_proc row, so checking for a nonzero hook-function reference would be a matter of a couple of instructions. If we go with a pg_cast entry then we're going to have to add a pg_cast lookup for every cast, whether it turns out to be optimizable or not; which is going to cost quite a lot more. The intermediate hook function I was sketching might be worthwhile from a performance standpoint even if we don't expose the more general feature to users, just because it would be possible to avoid useless pg_cast lookups (by not installing the hook except on pg_proc entries for which there's a relevant CAST WHEN function to call). Oh, really? I was thinking the logic should go into find_coercion_pathway(). Even slightly more complicated problems seem intractable - e.g. (x + 1) = x can be simplified to constant false, and NOT ((x + 1) = x) can be simplified to x IS NOT NULL, but under the proposed API those would have to hang off of =(int4,int4), which seems pretty darn ugly. True, but where else are you going to hang them off of? Not that I was particularly thinking of doing either one of those. Beats me, just thinking out loud. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] ALTER TYPE 3: add facility to identify further no-work cases
Robert Haas robertmh...@gmail.com writes: Oh, really? I was thinking the logic should go into find_coercion_pathway(). Well, I've been saying right along that it should be in eval_const_expressions. Putting this sort of optimization in the parser is invariably the wrong thing, because it fails to catch all the possibilities. As an example, inlining a SQL function could expose opportunities of this type. Another issue is that premature optimization in the parser creates headaches if conditions change such that a previous optimization is no longer valid --- you may have stored rules wherein the optimization was already applied. (Not sure that specific issue applies to casting, since we have no ALTER CAST commmand; but in general you want expression optimizations applied downstream from the rule rewriter not upstream.) regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] ALTER TYPE 3: add facility to identify further no-work cases
On Wed, Jan 26, 2011 at 5:41 PM, Tom Lane t...@sss.pgh.pa.us wrote: Robert Haas robertmh...@gmail.com writes: Oh, really? I was thinking the logic should go into find_coercion_pathway(). Well, I've been saying right along that it should be in eval_const_expressions. Putting this sort of optimization in the parser is invariably the wrong thing, because it fails to catch all the possibilities. As an example, inlining a SQL function could expose opportunities of this type. Another issue is that premature optimization in the parser creates headaches if conditions change such that a previous optimization is no longer valid --- you may have stored rules wherein the optimization was already applied. (Not sure that specific issue applies to casting, since we have no ALTER CAST commmand; but in general you want expression optimizations applied downstream from the rule rewriter not upstream.) Well, I guess my thought was that we what we were doing is extending the coercion system to be able to make decisions based on both type OID and typemod. It seems very odd to make an initial decision based on type OID in one place and then have a completely different system for incorporating the typemod; and it also seems quite a bit more expensive, since you'd have to consider it for every function, not just casts. A further problem is that I don't think it'd play well at all with the richer-typemod concept we keep bandying about. If, for example, we represented all arrays with the same OID (getting rid of the double-entries in pg_type) and used some kind of augmented-typemod to store the number of dimensions and the base type, this would completely fall over. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] ALTER TYPE 3: add facility to identify further no-work cases
I wrote: ... Another issue is that premature optimization in the parser creates headaches if conditions change such that a previous optimization is no longer valid --- you may have stored rules wherein the optimization was already applied. (Not sure that specific issue applies to casting, since we have no ALTER CAST commmand; but in general you want expression optimizations applied downstream from the rule rewriter not upstream.) Actually, I can construct a concrete example where applying this optimization in the parser will do the wrong thing: CREATE TABLE base (f1 varchar(4)); CREATE VIEW vv AS SELECT f1::varchar(8) FROM base; ALTER TABLE base ALTER COLUMN f1 TYPE varchar(16); If the parser is taught to throw away useless length coercions, then the stored form of vv will contain no cast, and the results will be wrong after base's column is widened. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] ALTER TYPE 3: add facility to identify further no-work cases
On Wed, Jan 26, 2011 at 05:55:37PM -0500, Tom Lane wrote: I wrote: ... Another issue is that premature optimization in the parser creates headaches if conditions change such that a previous optimization is no longer valid --- you may have stored rules wherein the optimization was already applied. (Not sure that specific issue applies to casting, since we have no ALTER CAST commmand; but in general you want expression optimizations applied downstream from the rule rewriter not upstream.) Actually, I can construct a concrete example where applying this optimization in the parser will do the wrong thing: CREATE TABLE base (f1 varchar(4)); CREATE VIEW vv AS SELECT f1::varchar(8) FROM base; ALTER TABLE base ALTER COLUMN f1 TYPE varchar(16); If the parser is taught to throw away useless length coercions, then the stored form of vv will contain no cast, and the results will be wrong after base's column is widened. We reject that: [local] test=# ALTER TABLE base ALTER COLUMN f1 TYPE varchar(16); ERROR: cannot alter type of a column used by a view or rule DETAIL: rule _RETURN on view vv depends on column f1 -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] ALTER TYPE 3: add facility to identify further no-work cases
Robert Haas robertmh...@gmail.com writes: Well, I guess my thought was that we what we were doing is extending the coercion system to be able to make decisions based on both type OID and typemod. Well, that's an interesting thought, but the proposal at hand is far more limited than that --- it's only an optimization that can be applied in limited situations. If we want to do something like what you're suggesting here, it's not going to get done for 9.1. A further problem is that I don't think it'd play well at all with the richer-typemod concept we keep bandying about. If, for example, we represented all arrays with the same OID (getting rid of the double-entries in pg_type) and used some kind of augmented-typemod to store the number of dimensions and the base type, this would completely fall over. Your proposal would completely fall over, you mean. An Expr-Expr hook API wouldn't be affected at all. I'm not sure how important that concern is though, because it's hard to see how any such change wouldn't break existing cast implementation functions anyway. If the API for length-coercing cast functions changes, breaking their helper functions too hardly seems like an issue. Or are you saying you want to punt this whole proposal till after that happens? I had muttered something of the sort way upthread, but I didn't think anyone else thought that way. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] ALTER TYPE 3: add facility to identify further no-work cases
Noah Misch n...@leadboat.com writes: On Wed, Jan 26, 2011 at 05:55:37PM -0500, Tom Lane wrote: Actually, I can construct a concrete example where applying this optimization in the parser will do the wrong thing: CREATE TABLE base (f1 varchar(4)); CREATE VIEW vv AS SELECT f1::varchar(8) FROM base; ALTER TABLE base ALTER COLUMN f1 TYPE varchar(16); If the parser is taught to throw away useless length coercions, then the stored form of vv will contain no cast, and the results will be wrong after base's column is widened. We reject that: [local] test=# ALTER TABLE base ALTER COLUMN f1 TYPE varchar(16); ERROR: cannot alter type of a column used by a view or rule DETAIL: rule _RETURN on view vv depends on column f1 Nonetheless, the stored form of vv will contain no cast, which can hardly be thought safe here. Relaxing the restriction you cite is (or should be) on the TODO list, but we'll never be able to do it if the parser throws away relevant information. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] ALTER TYPE 3: add facility to identify further no-work cases
On Wed, Jan 26, 2011 at 05:32:00PM -0500, Tom Lane wrote: Robert Haas robertmh...@gmail.com writes: Well, if you're positive we're eventually going to want this in pg_proc, we may as well add it now. But I'm not too convinced it's the right general API. The number of people writing exactly x + 0 or x * 0 in a query has got to be vanishingly small; I'm not eager to add additional parse analysis time to every SQL statement that has a function in it just to detect those cases. Actually, you've got that backwards: the facility I've got in mind would cost next to nothing when not used. The place where we'd want to insert this in eval_const_expressions has already got its hands on the relevant pg_proc row, so checking for a nonzero hook-function reference would be a matter of a couple of instructions. If we go with a pg_cast entry then we're going to have to add a pg_cast lookup for every cast, whether it turns out to be optimizable or not; which is going to cost quite a lot more. The intermediate hook function I was sketching might be worthwhile from a performance standpoint even if we don't expose the more general feature to users, just because it would be possible to avoid useless pg_cast lookups (by not installing the hook except on pg_proc entries for which there's a relevant CAST WHEN function to call). If we hook this into eval_const_expressions, it definitely seems cleaner to attach the auxiliary function to the pg_proc. Otherwise, we'd reconstruct which cast led to each function call -- is there even enough information available to do so unambiguously? Unlike something typmod-specific, these functions would effectively need to be written in C. Seems like a perfectly acceptable constraint, though. For the syntax, then, would a new common_func_opt_item of WHEN func fit? That covers fully-removable casts, but ALTER TABLE still needs to identify casts that may throw errors but never change the value's binary representation. Where does that fit? Another pg_proc column for a function called to answer that question, called only from an ALTER TABLE-specific code path? Thanks for the feedback/analysis. nm -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] ALTER TYPE 3: add facility to identify further no-work cases
Noah Misch n...@leadboat.com writes: If we hook this into eval_const_expressions, it definitely seems cleaner to attach the auxiliary function to the pg_proc. Otherwise, we'd reconstruct which cast led to each function call -- is there even enough information available to do so unambiguously? As far as that goes, yes there is --- otherwise ruleutils would be unable to reverse-list cast constructs. IIRC the function call is marked as being a cast, and you get the source and dest type IDs from ordinary exprType() inspection. For the syntax, then, would a new common_func_opt_item of WHEN func fit? WHEN is appropriate for the restricted CAST case, but it doesn't seem like le mot juste for a general function option, unless we restrict the helper function to return boolean which is not what I had in mind. That covers fully-removable casts, but ALTER TABLE still needs to identify casts that may throw errors but never change the value's binary representation. I remain completely unexcited about optimizing that case, especially if it doesn't fit into a general framework. The bang for the buck ratio is not good: too much work, too much uglification, too little return. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] ALTER TYPE 3: add facility to identify further no-work cases
On Wed, Jan 26, 2011 at 06:29:57PM -0500, Tom Lane wrote: Noah Misch n...@leadboat.com writes: If we hook this into eval_const_expressions, it definitely seems cleaner to attach the auxiliary function to the pg_proc. Otherwise, we'd reconstruct which cast led to each function call -- is there even enough information available to do so unambiguously? As far as that goes, yes there is --- otherwise ruleutils would be unable to reverse-list cast constructs. IIRC the function call is marked as being a cast, and you get the source and dest type IDs from ordinary exprType() inspection. Good point. So it is, FuncExpr.funcformat conveys that. For the syntax, then, would a new common_func_opt_item of WHEN func fit? WHEN is appropriate for the restricted CAST case, but it doesn't seem like le mot juste for a general function option, unless we restrict the helper function to return boolean which is not what I had in mind. True. Uhh, PARSER MAPPING func? PLANS CONVERSION func? That covers fully-removable casts, but ALTER TABLE still needs to identify casts that may throw errors but never change the value's binary representation. I remain completely unexcited about optimizing that case, especially if it doesn't fit into a general framework. The bang for the buck ratio is not good: too much work, too much uglification, too little return. The return looks attractive when you actually save six hours of downtime. If I'm the only one that sees such a savings for one of his databases, though, I suppose it's not worthwhile. We'd miss optimizing these cases: numeric(8,2) - numeric(7,2) varbit(8) - varbit(7) text - xml The xml one is the most interesting to me. In the design thread, Robert also expressed interest in that one. Jim Nasby mentioned numeric generally; Jim, what kind of numeric conversions are important to you? If anyone else will miss one of those greatly, please speak up. I found that many interesting cases in this area, most notably varchar(8) - varchar(4), are safe a good deal of the time, but not provably so. So perhaps a system for automatically detecting them would be overkill, but an addition to the ALTER TABLE syntax[1] to request them would be worthwhile. nm [1] http://archives.postgresql.org/message-id/20110106042626.ga28...@tornado.leadboat.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] ALTER TYPE 3: add facility to identify further no-work cases
Noah Misch n...@leadboat.com writes: On Wed, Jan 26, 2011 at 06:29:57PM -0500, Tom Lane wrote: I remain completely unexcited about optimizing that case, especially if it doesn't fit into a general framework. The bang for the buck ratio is not good: too much work, too much uglification, too little return. The return looks attractive when you actually save six hours of downtime. If I'm the only one that sees such a savings for one of his databases, though, I suppose it's not worthwhile. We'd miss optimizing these cases: numeric(8,2) - numeric(7,2) varbit(8) - varbit(7) text - xml But how often do those really come up? And do you really save that much? The table still has to be locked against other users, so you're still down, and you're still doing all the reads and computation. I don't deny that saving the writes is worth something; I just don't agree that it's worth the development and maintenance effort that such a wart is going to cost us. User-exposed features are *expensive*. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] ALTER TYPE 3: add facility to identify further no-work cases
Noah Misch n...@leadboat.com writes: text - xml BTW, that reminds me of something that I think was mentioned way back when, but absolutely fails to fit into any of the frameworks discussed here: the mere fact that a type is binary-compatible (with or without checking) doesn't make it compatible enough to not have to recreate indexes. Where and how are we going to have a wart to determine if that's necessary? And if the answer is rebuild indexes whenever the data type changes, isn't that a further big dent in the argument that it's worth avoiding a table rewrite? A text-xml replacement is going to be far from cheap anyway. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] ALTER TYPE 3: add facility to identify further no-work cases
On Wed, Jan 26, 2011 at 6:06 PM, Tom Lane t...@sss.pgh.pa.us wrote: Robert Haas robertmh...@gmail.com writes: Well, I guess my thought was that we what we were doing is extending the coercion system to be able to make decisions based on both type OID and typemod. Well, that's an interesting thought, but the proposal at hand is far more limited than that --- it's only an optimization that can be applied in limited situations. If we want to do something like what you're suggesting here, it's not going to get done for 9.1. Eh, why is this not entirely straightforward? *scratches head* -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] ALTER TYPE 3: add facility to identify further no-work cases
On Wed, Jan 26, 2011 at 7:44 PM, Tom Lane t...@sss.pgh.pa.us wrote: But how often do those really come up? And do you really save that much? The table still has to be locked against other users, so you're still down, and you're still doing all the reads and computation. I don't deny that saving the writes is worth something; I just don't agree that it's worth the development and maintenance effort that such a wart is going to cost us. User-exposed features are *expensive*. I would think that text - [something that's still a varlena but with tighter validation] would be quite common. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] ALTER TYPE 3: add facility to identify further no-work cases
On Wed, Jan 26, 2011 at 07:44:43PM -0500, Tom Lane wrote: numeric(8,2) - numeric(7,2) varbit(8) - varbit(7) text - xml But how often do those really come up? I'll speak from my own experience, having little idea of the larger community experience on this one. I usually don't even contemplate changes like this on nontrivial tables, because the pain of the downtime usually won't make up for getting the schema just like I want it. Cases that I can't discard on those grounds are fairly rare. As an order-of-magnitude estimate, I'd throw out one instance per DBA-annum among the DBAs whose work I witness. And do you really save that much? The table still has to be locked against other users, so you're still down, and you're still doing all the reads and computation. I don't deny that saving the writes is worth something; I just don't agree that it's worth the development and maintenance effort that such a wart is going to cost us. User-exposed features are *expensive*. If you have no indexes, you still save 50-75% of the cost by just reading and computing, not rewriting. Each index you add, even if it doesn't involve the column, pushes that advantage even further. With a typical handful of indexes, a 95+% cost savings is common enough. If we implemented ALTER TABLE ... SET DATA TYPE ... IMPLICIT, I'd agree that the marginal value of automatically detecting the above three cases would not justify the cost. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] ALTER TYPE 3: add facility to identify further no-work cases
On Wed, Jan 26, 2011 at 07:52:10PM -0500, Tom Lane wrote: Noah Misch n...@leadboat.com writes: text - xml BTW, that reminds me of something that I think was mentioned way back when, but absolutely fails to fit into any of the frameworks discussed here: the mere fact that a type is binary-compatible (with or without checking) doesn't make it compatible enough to not have to recreate indexes. Where and how are we going to have a wart to determine if that's necessary? Design (section 3): http://archives.postgresql.org/message-id/20101229125625.ga27...@tornado.gateway.2wire.net Implementation: http://archives.postgresql.org/message-id/20110113230124.ga18...@tornado.gateway.2wire.net [disclaimer: I've yet to post an updated version fixing a localized bug in this patch] I ended up making no attempt to optimize indexes that have predicates or expression columns; we'll just rebuild them every time. Aside from that, I failed to find an index on built-in types that requires a rebuild after a type change optimized by this patch stack. So, the entire wart is really for the benefit of third-party types that might need it. And if the answer is rebuild indexes whenever the data type changes, isn't that a further big dent in the argument that it's worth avoiding a table rewrite? No. Rewriting the table means rebuilding *all* indexes, but the worst case for a non-rewrite type change is to rebuild all indexes that depend on the changed column. That's a large win by itself, but we'll usually do even better. A text-xml replacement is going to be far from cheap anyway. It's tough to generalize. You can certainly construct a pathological case with minimal win, but you can just as well construct the opposite. Consider a wide table with a narrow XML column. Consider a usually-NULL XML column. nm -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] ALTER TYPE 3: add facility to identify further no-work cases
On Wed, Jan 26, 2011 at 09:35:24AM -0500, Robert Haas wrote: On Mon, Jan 24, 2011 at 11:13 PM, Noah Misch n...@leadboat.com wrote: This helps on conversions like varchar(4)-varchar(8) and text-xml. I've read through this patch somewhat. ?As I believe Tom also commented previously, exemptor is a pretty bad choice of name. I spent awhile with a thesaurus before coining that. ?Since Tom's comment, I've been hoping the next person to complain would at least suggest a better name. Shooting in the dark on a nomenclature issue isn't fun, and I'm not picky. CREATE CAST (source_type AS target_type) WITH FUNCTION function_name (argument_type, [, ...]) [ ANALYZE USING function_name ] [ AS ASSIGNMENT | AS IMPLICIT ] Thanks for writing about it. Seems the rest of the thread converged pretty well on a set of viable name candidates. ?I think what this patch is trying to do is tag the call to the cast function with the information that we might not need to call it, but ISTM it would be better to just notice that the function call is unnecessary and insert a RelabelType node instead. This patch does exactly that for varchar(4) - varchar(8) and similar cases. Oh, uh, err... OK, I obviously haven't understood it well enough. I'll look at it some more, but if there's anything you can write up to explain what you changed and why, that would be helpful. Based on downthread discussion, I figure this will all change a good deal. I'll still briefly explain the patch as written. Most of the patch is plumbing to support the new syntax, catalog entries, and FuncExpr field. The important changes are in parse_coerce.c. I modified find_coercion_pathway() and find_typmod_coercion_function() to retrieve pg_cast.castexemptor alongside pg_cast.castfunc. Their callers (coerce_type() and coerce_type_typmod(), respectively) then call the new apply_exemptor_function(), which calls the exemptor function, if any, returns the value to place in FuncExpr.funcexempt, and possibly updates the CoercionPathType the caller is about to use. build_coercion_expression(), unchanged except to populate FuncExpr.funcexempt, remains responsible for creating the appropriate node (RelabelType, FuncExpr). Finally, I change GetCoerceExemptions to use FuncExpr.funcexempt. I think I'm also having trouble with the fact that these patches don't apply cleanly against the master branch, because they're stacked up on each other. Maybe this will be easier once we get more of the ALTER TYPE 2 stuff in. It's certainly tricky to review a bunch of patches in parallel when they have a serial dependency chain. I'll merge the at0 and at2 bits per your request to see what we can do there. As for performance, coerce_to_target_type() is certainly in wide use, but it's not exactly a hot path. ?I prepared and ran the attached bench-coerce-worst.sql and bench-coerce-best.sql. ?Both are quite artificial. ?The first one basically asks Can we measure the new overhead at all? ?The second one asks Would any ordinary query benefit from removing a superfluous cast from an expression tree? ?The worst case had an 4% _speedup_, and the best case had a 27% speedup, suggesting answers of no and yes (but not dramatically). ?The worst-case speedup doesn't make sense, so it's probably an artifact of some recent change on master creating a larger performance dip in this pathological test case. ?I could rebase the patch and rerun the benchmark, but I doubt an exact figure would be much more meaningful. ?Unfortunately, I can't think of any more-natural tests (help welcome) that would still illustrate any performance difference. That certainly seems promising, but I don't understand how the new code can be faster on your constructed worst case. Thoughts? I hadn't merged master into my at3 branch in awhile; my best guess is that some recent change in master slowed that test case down by about 4%. I could merge up to confirm that theory. Again, though, it's an abjectly silly test case. Even if the test showed a 50% slowdown, we wouldn't have much cause for concern. Perhaps a 300% slowdown would have been notable. I'm more concerned about modularity than performance. Sticking a field into every FuncExpr so that if it happens to get passed to an ALTER TABLE statement we can pull the flag out seems really ugly to me. Understood. I agree that it's awfully specialized to be hanging around in FuncExpr. Doing it this way seemed to minimize the global quantity of ugly code, but you're right -- better to have somewhat-ugly code in tablecmds.c than to expose this in FuncExpr. Most of the benefit of the current approach will be gone when the optimization moves to eval_const_expressions(), anyway. One way or another, the next version will not do this. Thanks, nm -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription:
Re: [HACKERS] ALTER TYPE 3: add facility to identify further no-work cases
On Sun, Jan 9, 2011 at 5:03 PM, Noah Misch n...@leadboat.com wrote: Here I add the notion of an exemptor function, a property of a cast that determines when calls to the cast would be superfluous. Such calls can be removed, reduced to RelabelType expressions, or annotated (via a new field in FuncExpr) with the applicable exemptions. I modify various parse_coerce.c functions to retrieve, call, and act on these exemptor functions; this includes GetCoerceExemptions() from the last patch. I did opt to make find_typmod_coercion_function return COERCION_PATH_RELABELTYPE when no work is needed, rather than COERCION_PATH_NONE; this makes it consistent with find_coercion_pathway's use of that enumeration. To demonstrate the functionality, I add exemptor functions for varchar and xml. Originally I was only going to start with varchar, but xml tests the other major code path, and the exemptor function for xml is dead simple. This helps on conversions like varchar(4)-varchar(8) and text-xml. I've read through this patch somewhat. As I believe Tom also commented previously, exemptor is a pretty bad choice of name. More generally, I think this patch is a case of coding something complicated without first achieving consensus on what the behavior should be. I think we need to address that problem first, and then decide what code needs to be written, and then write the code. It seems to me that patch two in this series has the right idea: skip rewriting the table in cases where the types are binary coercible (WITHOUT FUNCTION) or one is an unconstrained domain over the other. IIUC, the problem this patch is trying to address is that our current CAST infrastructure is insufficient to identify all cases where this is true - in particular, it makes the decision solely based on the type OIDs, without consideration of the typmods. In general, varchar - varchar is not binary coercible, but varchar(4) - varchar(8) is binary coercible. I think what this patch is trying to do is tag the call to the cast function with the information that we might not need to call it, but ISTM it would be better to just notice that the function call is unnecessary and insert a RelabelType node instead. *reads archives* In fact, Tom already made pretty much exactly this suggestion, on a thread entitled Avoiding rewrite in ALTER TABLE ALTER TYPE. The only possible objection I can see to that line of attack is that it's not clear what the API should be, or whether there might be too much runtime overhead for the amount of benefit that can be obtained. This is, incidentally, another problem with this patch - if you want to make wide-ranging changes to the system like this, you need to provide a convincing argument that it's not going to compromise correctness or performance. Just submitting the patch without making an argument for why it's correct is no good, unless it's so obviously correct as to be unquestionable, which is certainly not the case here. You've got to write up some kind of submission notes so that the reviewer isn't trying to guess why you think this is the right approach, or spending a lot of time thinking through approaches you've discarded for good reason. If there is a danger of performance regression, you need to refute it, preferably with actual benchmarks. A particular aspect I'm concerned about with this patch in particular: it strikes me that the overhead of calling the exemptor functions in the ALTER TABLE case is negligible, but that this might not be true in other contexts where some of this logic may get invoked - it appears to implicate the casting machinery much more generally. I think it's a mistake to merge the handling of the rewrite-vs-noop case with the check-vs-noop case. They're fundamentally dissimilar. As noted above, being able to notice the noop case seems like it could save run-time work during ordinary query execution. But detecting the check case is useless work except in the specific case of a table rewrite. If we're going to do that at all, it seems to me that it needs to be done via a code-path that's only going to get invoked in the case of ALTER TABLE, not something that's going to happen every time we parse an expression tree. But it seems to me that it might be better to take the suggestion I made before: forget about the check-only case for now, and focus on the do-nothing case. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] ALTER TYPE 3: add facility to identify further no-work cases
On Mon, Jan 24, 2011 at 07:18:47PM -0500, Robert Haas wrote: On Sun, Jan 9, 2011 at 5:03 PM, Noah Misch n...@leadboat.com wrote: Here I add the notion of an exemptor function, a property of a cast that determines when calls to the cast would be superfluous. ?Such calls can be removed, reduced to RelabelType expressions, or annotated (via a new field in FuncExpr) with the applicable exemptions. ?I modify various parse_coerce.c functions to retrieve, call, and act on these exemptor functions; this includes GetCoerceExemptions() from the last patch. ?I did opt to make find_typmod_coercion_function return COERCION_PATH_RELABELTYPE when no work is needed, rather than COERCION_PATH_NONE; this makes it consistent with find_coercion_pathway's use of that enumeration. To demonstrate the functionality, I add exemptor functions for varchar and xml. Originally I was only going to start with varchar, but xml tests the other major code path, and the exemptor function for xml is dead simple. This helps on conversions like varchar(4)-varchar(8) and text-xml. I've read through this patch somewhat. As I believe Tom also commented previously, exemptor is a pretty bad choice of name. I spent awhile with a thesaurus before coining that. Since Tom's comment, I've been hoping the next person to complain would at least suggest a better name. Shooting in the dark on a nomenclature issue isn't fun, and I'm not picky. More generally, I think this patch is a case of coding something complicated without first achieving consensus on what the behavior should be. I think we need to address that problem first, and then decide what code needs to be written, and then write the code. Well, that's why I started the design thread Avoiding rewrite in ALTER TABLE ALTER TYPE before writing any code. That thread petered out rather than reach any consensus. What would you have done then? It seems to me that patch two in this series has the right idea: skip rewriting the table in cases where the types are binary coercible (WITHOUT FUNCTION) or one is an unconstrained domain over the other. IIUC, the problem this patch is trying to address is that our current CAST infrastructure is insufficient to identify all cases where this is true - in particular, it makes the decision solely based on the type OIDs, without consideration of the typmods. In general, varchar - varchar is not binary coercible, but varchar(4) - varchar(8) is binary coercible. I think what this patch is trying to do is tag the call to the cast function with the information that we might not need to call it, but ISTM it would be better to just notice that the function call is unnecessary and insert a RelabelType node instead. This patch does exactly that for varchar(4) - varchar(8) and similar cases. *reads archives* In fact, Tom already made pretty much exactly this suggestion, on a thread entitled Avoiding rewrite in ALTER TABLE ALTER TYPE. The only possible objection I can see to that line of attack is that it's not clear what the API should be, or whether there might be too much runtime overhead for the amount of benefit that can be obtained. I believe the patch does implement Tom's suggestion, at least in spirit. He mentioned expression_planner() as the place to do it. In principle, We could do it *right there* and avoid any changes to coerce_to_target_type(), but the overhead would increase. Specifically, we would be walking an already-built expression tree looking for casts to remove, probably doing a syscache lookup for every cast to grab the exemptor function OID. Doing it there would prevent us from directly helping or harming plain old queries. Since ExecRelCheck(), FormIndexDatum() and other notable paths call expression_planner(), we wouldn't want to casually add an expensive algorithm there. To avoid the extra syscache lookups, we'd piggyback off those already done in coerce_to_target_type(). That brings us to the current implementation. Better to make the entire decision in coerce_to_target_type() and never add the superfluous node to the expression tree in the first place. coerce_to_target_type() and callees seemed like the natural, low cost place to make the changes. By doing it that way, did I miss some important detail or implication of Tom's suggestion? This is, incidentally, another problem with this patch - if you want to make wide-ranging changes to the system like this, you need to provide a convincing argument that it's not going to compromise correctness or performance. Just submitting the patch without making an argument for why it's correct is no good, unless it's so obviously correct as to be unquestionable, which is certainly not the case here. You've got to write up some kind of submission notes so that the reviewer isn't trying to guess why you think this is the right approach, or spending a lot of time thinking through approaches you've