Re: JIT performance bug/regression & JIT EXPLAIN
On Mon, Jan 27, 2020 at 4:18 PM Tom Lane wrote: > Robert Haas writes: > >>> I do not think that the readability-vs-usefulness tradeoff is going > >>> to be all that good there, anyway. Certainly for testing purposes > >>> it's going to be more useful to examine portions of a structured output. > > > I intensely dislike having information that we can't show in the text > > format, or really, that we can't show in every format. > > Well, if it's relegated to a "jit = detail" option or some such, > the readability objection could be overcome. But I'm still not clear > on how you'd physically wedge it into the output, at least not in a way > that matches up with the proposal that non-text modes handle this stuff > by producing sub-nodes for the existing types of expression fields. Well, remember that the text format was the original format. The whole idea of "groups" was an anachronism that I imposed on the text format to make it possible to add other formats. It wasn't entirely natural, because the text format basically indicated nesting by indentation, and that wasn't going to work for XML or JSON. The text format also felt free to repeat elements and assume the reader would figure it out; repeating elements is OK in XML in general, but in JSON it's only OK if the surrounding context is an array rather than an object. Anyway, the point is that I (necessarily) started with whatever we had and found a way to fit it into a structure. It seems like it ought to be possible to go the other direction also, and figure out how to make the structured data look OK as text. Here's Andres's original example: "Filter": { "Expr": "(lineitem.l_shipdate <= '1998-09-18 00:00:00'::timestamp without time zone)", "JIT-Expr": "evalexpr_0_2", "JIT-Deform-Scan": "deform_0_3", "JIT-Deform-Outer": null, "JIT-Deform-Inner": null } Right now we show: Filter: (lineitem.l_shipdate <= '1998-09-18 00:00:00'::timestamp without time zone) Andres proposed: Filter: (lineitem.l_shipdate <= '1998-09-18 00:00:00'::timestamp without time zone); JIT-Expr: evalexpr_0_2, JIT-Deform-Scan: deform_0_3 That's not ideal because it's all on one line, but that could be changed: Filter: (lineitem.l_shipdate <= '1998-09-18 00:00:00'::timestamp without time zone) JIT-Expr: evalexpr_0_2 JIT-Deform-Scan: deform_0_3 I would propose either including null all the time or omitting it all the time, so that we would either change the JSON output to... "Filter": { "Expr": "(lineitem.l_shipdate <= '1998-09-18 00:00:00'::timestamp without time zone)", "JIT-Expr": "evalexpr_0_2", "JIT-Deform-Scan": "deform_0_3" } Or the text output to: Filter: (lineitem.l_shipdate <= '1998-09-18 00:00:00'::timestamp without time zone) JIT-Expr: evalexpr_0_2 JIT-Deform-Scan: deform_0_3 JIT-Deform-Outer: null JIT-Deform-Inner: null You could argue that this is inconsistent because the JSON format shows a bunch of keys that are essentially parallel, and this text format makes the Expr key essentially the primary value and the others secondary. But since the text format is for human beings, and since human beings are likely to find the Expr key to be the primary piece of information, maybe that's totally fine. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: JIT performance bug/regression & JIT EXPLAIN
On Mon, Jan 27, 2020 at 11:01 AM Robert Haas wrote: > On Fri, Nov 15, 2019 at 8:05 PM Maciek Sakrejda wrote: > > On Fri, Nov 15, 2019 at 5:49 AM Robert Haas wrote: > > > Personally, I don't care very much about backward-compatibility, or > > > about how hard it is for tools to parse. I want it to be possible, but > > > if it takes a little extra effort, so be it. > > > > I think these are two separate issues. I agree on > > backward-compatibility (especially if we can embed a server version in > > structured EXPLAIN output to make it easier for tools to track format > > differences), but not caring how hard it is for tools to parse? What's > > the point of structured formats, then? > > To make the data easy to parse. :-) > > I mean, it's clear that, on the one hand, having a format like JSON > that, as has recently been pointed out elsewhere, is parsable by a > wide variety of tools, is advantageous. However, I don't think it > really matters whether the somebody's got to look at a tag called > Flump and match it up with the data in another tag called JIT-Flump, > or whether there's a Flump group that has RegularStuff and JIT tags > inside of it. There's just not much difference in the effort involved. > Being able to parse the JSON or XML using generic code is enough of a > win that the details shouldn't matter that much. Having a structured EXPLAIN schema that's semantically consistent is still valuable. At the end of the day, it's humans who are writing the tools that consume that structured output. Given the sparse structured EXPLAIN schema documentation, as someone who currently works on EXPLAIN tooling, I'd prefer a trend toward consistency at the expense of backward compatibility. (Of course, we should avoid gratuitous changes.) But I take back the version number suggestion after reading Tom's response; that was naïve. > I think if you were going to complain about the limitations of our > current EXPLAIN output format, it'd make a lot more sense to focus on > the way we output expressions. That would be nice to have, but for what it's worth, my main complaint would be about documentation (especially around structured formats). The "Using EXPLAIN" section covers the basics, but understanding what node types exist, and what fields show up for what nodes and what they mean--that seems to be a big missing piece (I don't feel entitled to this documentation; as a structured format consumer, I'm just pointing out a deficiency). Contrast that with the great wire protocol documentation. In some ways it's easier to work on native drivers than on EXPLAIN tooling because the docs are thorough and well organized.
Re: JIT performance bug/regression & JIT EXPLAIN
Robert Haas writes: >>> I do not think that the readability-vs-usefulness tradeoff is going >>> to be all that good there, anyway. Certainly for testing purposes >>> it's going to be more useful to examine portions of a structured output. > I intensely dislike having information that we can't show in the text > format, or really, that we can't show in every format. Well, if it's relegated to a "jit = detail" option or some such, the readability objection could be overcome. But I'm still not clear on how you'd physically wedge it into the output, at least not in a way that matches up with the proposal that non-text modes handle this stuff by producing sub-nodes for the existing types of expression fields. regards, tom lane
Re: JIT performance bug/regression & JIT EXPLAIN
On Mon, Jan 27, 2020 at 12:41 PM Andres Freund wrote: > > I do not think that the readability-vs-usefulness tradeoff is going > > to be all that good there, anyway. Certainly for testing purposes > > it's going to be more useful to examine portions of a structured output. > > I think I can live with that, I don't think it's going to be a very > commonly used option. It's basically useful for regression tests, JIT > improvements, and people that want to see whether they can change their > query / schema to make better use of JIT - the latter category won't be > many, I think. I intensely dislike having information that we can't show in the text format, or really, that we can't show in every format. I might be outvoted, but I stand by that position. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: JIT performance bug/regression & JIT EXPLAIN
On Fri, Nov 15, 2019 at 8:05 PM Maciek Sakrejda wrote: > On Fri, Nov 15, 2019 at 5:49 AM Robert Haas wrote: > > Personally, I don't care very much about backward-compatibility, or > > about how hard it is for tools to parse. I want it to be possible, but > > if it takes a little extra effort, so be it. > > I think these are two separate issues. I agree on > backward-compatibility (especially if we can embed a server version in > structured EXPLAIN output to make it easier for tools to track format > differences), but not caring how hard it is for tools to parse? What's > the point of structured formats, then? To make the data easy to parse. :-) I mean, it's clear that, on the one hand, having a format like JSON that, as has recently been pointed out elsewhere, is parsable by a wide variety of tools, is advantageous. However, I don't think it really matters whether the somebody's got to look at a tag called Flump and match it up with the data in another tag called JIT-Flump, or whether there's a Flump group that has RegularStuff and JIT tags inside of it. There's just not much difference in the effort involved. Being able to parse the JSON or XML using generic code is enough of a win that the details shouldn't matter that much. I think if you were going to complain about the limitations of our current EXPLAIN output format, it'd make a lot more sense to focus on the way we output expressions. If you want to mechanically parse one of those expressions and figure out what it's doing - what functions or operators are involved, and to what they are being applied - you are probably out of luck altogether, and you are certainly not going to have an easy time of it. I'm not saying we have to solve that problem, but I believe it's a much bigger nuisance than the sort of thing we are talking about here. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: JIT performance bug/regression & JIT EXPLAIN
Hi, On 2020-01-27 12:15:53 -0500, Tom Lane wrote: > Maciek Sakrejda writes: > > On Fri, Nov 15, 2019 at 5:49 AM Robert Haas wrote: > >> Personally, I don't care very much about backward-compatibility, or > >> about how hard it is for tools to parse. I want it to be possible, but > >> if it takes a little extra effort, so be it. > > > I think these are two separate issues. I agree on > > backward-compatibility (especially if we can embed a server version in > > structured EXPLAIN output to make it easier for tools to track format > > differences), but not caring how hard it is for tools to parse? What's > > the point of structured formats, then? > > I'd not been paying any attention to this thread, but Andres just > referenced it in another discussion, so I went back and read it. > Here's my two cents: > > * I agree with Robert that conditionally changing "Output" to "Project" is > an absolutely horrid idea. Yea, I think I'm convinced on that front. I never liked the idea, and the opposition has been pretty unanimous... > That will break every tool that looks at this stuff, and it just flies > in the face of the design principle that the output schema should be > stable, and it'll be a long term pain-in-the-rear for regression test > back-patching, and it will confuse users much more than it will help > them. The other idea of suppressing "Output" in cases where no > projection is happening might be all right, but only in text format > where we don't worry about schema stability. Another idea perhaps is > to emit "Output: all columns" (in text formats, less sure what to do > in structured formats). I think I like the "all columns" idea. Not what I'd do on a green field, but... If we were just dealing with the XML format, we could just add a True/False to the current a b ... and it'd make plenty sense. but for json's "Output": ["a", "b"] and yaml's Output: - "a" - "b" that's not an option as far as I can tell. Not sure what to do about that. > * In the structured formats, I think it should be okay to convert > expression-ish fields from being raw strings to being {Expression} > sub-nodes with the raw string as one field. Aside from making it easy > to inject JIT info, that would also open the door to someday showing > expressions in some more-parse-able format than a string, since other > representations could also be added as new fields. (I have a vague > recollection of wanting a list of all the Vars used in an expression, > for example.) Cool. Being extendable seems like a good direction. That's what I primarily dislike about the various work-arounds for how to associate information about JIT by a "related" name. That'd e.g. open the door to have both a normalized and an original expression in the explain output. Which would be quite valuable for some monitoring tools. > * Unfortunately that does nothing for the problem of how to show > per-expression JIT info in text format. Maybe we just shouldn't. > I do not think that the readability-vs-usefulness tradeoff is going > to be all that good there, anyway. Certainly for testing purposes > it's going to be more useful to examine portions of a structured output. I think I can live with that, I don't think it's going to be a very commonly used option. It's basically useful for regression tests, JIT improvements, and people that want to see whether they can change their query / schema to make better use of JIT - the latter category won't be many, I think. Since this is going to be a default off option anyway, I don't think we'd need to be as concerned with compatibility. But even leaving compatibility aside, it's not that clear how to best attach information in the current text format, without being confusing. Greetings, Andres Freund
Re: JIT performance bug/regression & JIT EXPLAIN
Maciek Sakrejda writes: > On Fri, Nov 15, 2019 at 5:49 AM Robert Haas wrote: >> Personally, I don't care very much about backward-compatibility, or >> about how hard it is for tools to parse. I want it to be possible, but >> if it takes a little extra effort, so be it. > I think these are two separate issues. I agree on > backward-compatibility (especially if we can embed a server version in > structured EXPLAIN output to make it easier for tools to track format > differences), but not caring how hard it is for tools to parse? What's > the point of structured formats, then? I'd not been paying any attention to this thread, but Andres just referenced it in another discussion, so I went back and read it. Here's my two cents: * I agree with Robert that conditionally changing "Output" to "Project" is an absolutely horrid idea. That will break every tool that looks at this stuff, and it just flies in the face of the design principle that the output schema should be stable, and it'll be a long term pain-in-the-rear for regression test back-patching, and it will confuse users much more than it will help them. The other idea of suppressing "Output" in cases where no projection is happening might be all right, but only in text format where we don't worry about schema stability. Another idea perhaps is to emit "Output: all columns" (in text formats, less sure what to do in structured formats). * In the structured formats, I think it should be okay to convert expression-ish fields from being raw strings to being {Expression} sub-nodes with the raw string as one field. Aside from making it easy to inject JIT info, that would also open the door to someday showing expressions in some more-parse-able format than a string, since other representations could also be added as new fields. (I have a vague recollection of wanting a list of all the Vars used in an expression, for example.) * Unfortunately that does nothing for the problem of how to show per-expression JIT info in text format. Maybe we just shouldn't. I do not think that the readability-vs-usefulness tradeoff is going to be all that good there, anyway. Certainly for testing purposes it's going to be more useful to examine portions of a structured output. * I'm not on board with the idea of adding a version number to the structured output formats. In the first place, it's too late, since we didn't leave room for one to begin with. In the second, an overall version number just isn't very helpful for this sort of problem. If a tool sees a version number higher than the latest thing it knows, what's it supposed to do, just fail? In practice it could still extract an awful lot of info, so that really isn't a desirable answer. It's better if the data structure is such that a tool can understand that some sub-part of the data is something it can't interpret, and just ignore that part. (This is more or less the same design principle that PNG image format was built on, FWIW.) Adding on fields to an existing node type easily meets that requirement, as does inventing new sub-node types, and that's all that we've done so far. But I think that replacing a scalar field value with a sub-node probably works too (at least for well-written tools), so the expression change suggested above should be OK. regards, tom lane
Re: JIT performance bug/regression & JIT EXPLAIN
On Fri, Nov 15, 2019 at 5:49 AM Robert Haas wrote: > Personally, I don't care very much about backward-compatibility, or > about how hard it is for tools to parse. I want it to be possible, but > if it takes a little extra effort, so be it. I think these are two separate issues. I agree on backward-compatibility (especially if we can embed a server version in structured EXPLAIN output to make it easier for tools to track format differences), but not caring how hard it is for tools to parse? What's the point of structured formats, then? > My main concern is having > the text output look good to human beings, because that is the primary > format and they are the primary consumers. Structured output is also for human beings, albeit indirectly. That text is the primary format may be more of a reflection of the difficulty of building and integrating EXPLAIN tools than its inherent superiority (that said, I'll concede it's a concise and elegant format for what it does). What if psql supported an EXPLAINER like it does EDITOR? For what it's worth, after thinking about this a bit, I'd like to see structured EXPLAIN evolve into a more consistent format, even if it means breaking changes (and I do think a version specifier at the root of the plan would make this easier).
Re: JIT performance bug/regression & JIT EXPLAIN
On Wed, Nov 13, 2019 at 3:03 PM Andres Freund wrote: > Well, it's not been that way since the format option was added, so ... It was pretty close in the original version, but people keep trying to be clever. > > I also think that conditionally renaming "Output" to "Project" is a > > super-bad idea. The idea of a format like this is that the "keys" stay > > constant and the values change. If you need to tell people more, you > > add more keys. > > Yea, I don't like the compat break either. But I'm not so convinced > that just continuing to collect cruft because of compatibility is worth > it - I just don't see an all that high reliance interest for explain > output. > > I think adding a new key is somewhat ok for !text, but for text that > doesn't seem like an easy solution? > > I kind of like my idea somewhere downthread, in a reply to Maciek, of > simply not listing "Output" for nodes that don't project. While that's > still a format break, it seems that tools already need to deal with > "Output" not being present? Yes, I think leaving out Output for a node that doesn't Project would be fine, as long as we're consistent about it. > > But making it always be a group doesn't necessarily seem like a bad > > idea. I think, though, that you could handle this in other ways, like > > by suffixing existing keys. e.g. if you've got Index-Qual and Filter, > > just do Index-Qual-JIT and Filter-JIT and call it good. > > Maciek suggested the same. But to me it seems going down that way will > make the format harder and harder to understand? So I think I'd rather > break compat here, and go for a group. Personally, I don't care very much about backward-compatibility, or about how hard it is for tools to parse. I want it to be possible, but if it takes a little extra effort, so be it. My main concern is having the text output look good to human beings, because that is the primary format and they are the primary consumers. > Personally I think the group naming choice for explain makes the the > !text outputs much less useful than they could be - we basically force > every tool to understand all possible keys, to make sense of formatted > output. Instead of something like 'Filter: {"Qual":{"text" : "...", > "JIT": ...}' where a tool only needed to understand that everything that > has a "Qual" inside is a filtering expression, everything that has a > "Project" is a projecting type of expression, ... a tool needs to know > about "Inner Cond", "Order By", "Filter", "Recheck Cond", "TID Cond", > "Join Filter", "Merge Cond", "Hash Cond", "One-Time Filter", ... It's not that long of a list, and I don't know of a tool that tries to do something in particular with all of those types of things anyway. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: JIT performance bug/regression & JIT EXPLAIN
Hi, On 2019-11-13 14:29:07 -0500, Robert Haas wrote: > On Mon, Oct 28, 2019 at 7:21 PM Andres Freund wrote: > > Because that's the normal way to represent something non-existing for > > formats like json? There's a lot of information we show always for !text > > format, even if not really applicable to the context (e.g. Triggers for > > select statements). I think there's an argument to made to deviate in > > this case, but I don't think it's obvious. > > I've consistently been of the view that anyone who thinks that the > FORMAT option should affect what information gets displayed doesn't > understand the meaning of the word "format." And I still feel that > way. Well, it's not been that way since the format option was added, so ... > I also think that conditionally renaming "Output" to "Project" is a > super-bad idea. The idea of a format like this is that the "keys" stay > constant and the values change. If you need to tell people more, you > add more keys. Yea, I don't like the compat break either. But I'm not so convinced that just continuing to collect cruft because of compatibility is worth it - I just don't see an all that high reliance interest for explain output. I think adding a new key is somewhat ok for !text, but for text that doesn't seem like an easy solution? I kind of like my idea somewhere downthread, in a reply to Maciek, of simply not listing "Output" for nodes that don't project. While that's still a format break, it seems that tools already need to deal with "Output" not being present? > I also think that making the Filter field a group conditionally is a > bad idea, for similar reasons. Oh, yea, it's utterly terrible (I called it crappy in my email :)). > But making it always be a group doesn't necessarily seem like a bad > idea. I think, though, that you could handle this in other ways, like > by suffixing existing keys. e.g. if you've got Index-Qual and Filter, > just do Index-Qual-JIT and Filter-JIT and call it good. Maciek suggested the same. But to me it seems going down that way will make the format harder and harder to understand? So I think I'd rather break compat here, and go for a group. Personally I think the group naming choice for explain makes the the !text outputs much less useful than they could be - we basically force every tool to understand all possible keys, to make sense of formatted output. Instead of something like 'Filter: {"Qual":{"text" : "...", "JIT": ...}' where a tool only needed to understand that everything that has a "Qual" inside is a filtering expression, everything that has a "Project" is a projecting type of expression, ... a tool needs to know about "Inner Cond", "Order By", "Filter", "Recheck Cond", "TID Cond", "Join Filter", "Merge Cond", "Hash Cond", "One-Time Filter", ... Greetings, Andres Freund
Re: JIT performance bug/regression & JIT EXPLAIN
On Mon, Oct 28, 2019 at 7:21 PM Andres Freund wrote: > Because that's the normal way to represent something non-existing for > formats like json? There's a lot of information we show always for !text > format, even if not really applicable to the context (e.g. Triggers for > select statements). I think there's an argument to made to deviate in > this case, but I don't think it's obvious. I've consistently been of the view that anyone who thinks that the FORMAT option should affect what information gets displayed doesn't understand the meaning of the word "format." And I still feel that way. I also think that conditionally renaming "Output" to "Project" is a super-bad idea. The idea of a format like this is that the "keys" stay constant and the values change. If you need to tell people more, you add more keys. I also think that making the Filter field a group conditionally is a bad idea, for similar reasons. But making it always be a group doesn't necessarily seem like a bad idea. I think, though, that you could handle this in other ways, like by suffixing existing keys. e.g. if you've got Index-Qual and Filter, just do Index-Qual-JIT and Filter-JIT and call it good. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: JIT performance bug/regression & JIT EXPLAIN
Hi, On 2019-11-12 13:42:10 -0800, Maciek Sakrejda wrote: > On Mon, Oct 28, 2019 at 5:02 PM Andres Freund wrote: > > What I dislike about that is that it basically again is introducing > > "again"? Am I missing some history here? I'd love to read up on this > if there are mistakes to learn from. I think I was mostly referring to mistakes we've made for the json etc key names. By e.g. having expressions as "Function Call", "Table Function Call", "Filter", "TID Cond", ... a tool that wants to interpret the output needs awareness of all of these different names, rather than knowing that everything with a sub-group "Expression" has to be an expression. I.e. instead of "Plan": { "Node Type": "Seq Scan", "Parallel Aware": false, "Relation Name": "pg_class", "Schema": "pg_catalog", "Alias": "pg_class", "Startup Cost": 0.00, "Total Cost": 17.82, "Plan Rows": 385, "Plan Width": 68, "Output": ["relname", "tableoid"], "Filter": "(pg_class.relname <> 'foo'::name)" } we ought to have gone for "Plan": { "Node Type": "Seq Scan", "Parallel Aware": false, "Relation Name": "pg_class", "Schema": "pg_catalog", "Alias": "pg_class", "Startup Cost": 0.00, "Total Cost": 17.82, "Plan Rows": 385, "Plan Width": 68, "Output": ["relname", "tableoid"], "Filter": {"Expression" : { "text": (pg_class.relname <> 'foo'::name)"}} } or something like that. Which'd then make it obvious how to add information about JIT to each expression. Whereas the proposal of the separate key name perpetuates the messiness... > > something that requires either pattern matching on key names (i.e. a key > > of '(.*) JIT' is one that has information about JIT, and the associated > > expresssion is in key $1), or knowing all the potential keys an > > expression could be in. > > That still seems less awkward than having to handle a Filter field > that's either scalar or a group. Yea, it's a sucky option :( > Most current EXPLAIN options just add > additional fields to the structured plan instead of modifying it, no? > If that output is better enough, though, maybe we should just always > make Filter a group and go with the breaking change? If tooling > authors need to treat this case specially anyway, might as well evolve > the format. Yea, maybe that's the right thing to do. Would be nice to have some more input... Greetings, Andres Freund
Re: JIT performance bug/regression & JIT EXPLAIN
On Mon, Oct 28, 2019 at 5:02 PM Andres Freund wrote: > What I dislike about that is that it basically again is introducing "again"? Am I missing some history here? I'd love to read up on this if there are mistakes to learn from. > something that requires either pattern matching on key names (i.e. a key > of '(.*) JIT' is one that has information about JIT, and the associated > expresssion is in key $1), or knowing all the potential keys an > expression could be in. That still seems less awkward than having to handle a Filter field that's either scalar or a group. Most current EXPLAIN options just add additional fields to the structured plan instead of modifying it, no? If that output is better enough, though, maybe we should just always make Filter a group and go with the breaking change? If tooling authors need to treat this case specially anyway, might as well evolve the format. > Another alternative would be to just remove the 'Output' line when a > node doesn't project - it can't really carry meaning in those cases > anyway? ¯\_(ツ)_/¯ For what it's worth, I certainly wouldn't miss it.
Re: JIT performance bug/regression & JIT EXPLAIN
Hi, On 2019-10-28 15:05:01 -0400, Robert Haas wrote: > On Fri, Sep 27, 2019 at 3:21 AM Andres Freund wrote: > > - JIT-Expr: whether the expression was JIT compiled (might e.g. not be > > the case because no parent was provided) > > - JIT-Deform-{Scan,Outer,Inner}: wether necessary, and whether JIT > > accelerated. > > > > I don't like these names much, but ... > > > > For the deform cases I chose to display > > a) the function name if JIT compiled > > b) "false" if the expression is JIT compiled, deforming is > >necessary, but deforming is not JIT compiled (e.g. because the slot type > >wasn't fixed) > > c) "null" if not necessary, with that being omitted in text mode. > > I mean, why not just omit in all modes if it's not necessary? I don't > see that making the information we produce randomly inconsistent > between modes is buying us anything. Because that's the normal way to represent something non-existing for formats like json? There's a lot of information we show always for !text format, even if not really applicable to the context (e.g. Triggers for select statements). I think there's an argument to made to deviate in this case, but I don't think it's obvious. Abstract formatting reasons aside, it's actually useful to see where we know we're dealing with tuples that don't need to be deformed and thus overhead due to that cannot be relevant. Not sure if there's sufficient consumers for that, but ... We e.g. should verify that the "none" doesn't suddenly vanish, because we broke the information that let us infer that we don't need tuple deforming - and that's easier to understand if there's an explicit field, rather than reasining from absence. IMO. Greetings, Andres Freund
Re: JIT performance bug/regression & JIT EXPLAIN
On Fri, Sep 27, 2019 at 3:21 AM Andres Freund wrote: > - JIT-Expr: whether the expression was JIT compiled (might e.g. not be > the case because no parent was provided) > - JIT-Deform-{Scan,Outer,Inner}: wether necessary, and whether JIT > accelerated. > > I don't like these names much, but ... > > For the deform cases I chose to display > a) the function name if JIT compiled > b) "false" if the expression is JIT compiled, deforming is >necessary, but deforming is not JIT compiled (e.g. because the slot type >wasn't fixed) > c) "null" if not necessary, with that being omitted in text mode. I mean, why not just omit in all modes if it's not necessary? I don't see that making the information we produce randomly inconsistent between modes is buying us anything. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: JIT performance bug/regression & JIT EXPLAIN
>But that's pretty crappy, because it means that the 'shape' of the output >depends on the jit_details option. Yeah, that would be hard to work with. What about adding it as a sibling group? "Filter": "(lineitem.l_shipdate <= '1998-09-18 00:00:00'::timestamp without time zone)", "Filter JIT": { "Expr": "evalexpr_0_2", "Deform Scan": "deform_0_3", "Deform Outer": null, "Deform Inner": null } Also not that pretty, but at least it's easier to work with (I also changed the dashes to spaces since that's what the rest of EXPLAIN is doing as a matter of style). >But the compat break due to that change is not small- perhaps we could instead >mark that in another way? We could add a "Projects" boolean key instead? Of course that's more awkward in text mode. Maybe compat break is less of an issue in text mode and we can treat this differently? >I'm not sure that 'TRANS' is the best placeholder for the transition value >here. Maybe $TRANS would be clearer? +1, I think the `$` makes it clearer that this is not a literal expression. >For HashJoin/Hash I've added 'Outer Hash Key' and 'Hash Key' for each key, but >only in verbose mode. That reads pretty well to me. What does the structured output look like?
Re: JIT performance bug/regression & JIT EXPLAIN
Hi, On 2019-09-27 00:20:53 -0700, Andres Freund wrote: > Unfortunately I found a performance regression for JITed query > compilation introduced in 12, compared to 11. Fixed in one of the > attached patches > (v1-0009-Fix-determination-when-tuple-deforming-can-be-JIT.patch > - which needs a better commit message). > > The first question is when to push that fix. I'm inclined to just do so > now - as we still do JITed tuple deforming in most cases, as well as > doing so in 11 in the places this patch fixes, the risk of that seems > low. But I can also see an arguments for waiting after 12.0. Since nobody opined, I now have pushed that, and the other fix mentioned later in that email. I'd appreciate comments on the rest of the email, it's clear that we need to improve the test infrastructure here. And also the explain output for grouping sets... Greetings, Andres Freund