Re: Detoasting optionally to make Explain-Analyze less misleading
> I'm not sure there's a need for it. When a query runs under > auto_explain, the output values will be sent to the client, > so those cycles should be accounted for anyway, no? > Yes, great point, the total duration reported by auto_explain includes it. Explicit serialization stats might still be helpful for folks when it is the bottleneck, but less useful for sure (especially if nothing else causes big discrepancies between the duration reported by auto_explain and the "actual total time" of the root node). (Perhaps the auto_explain documentation should mention this?) > I'd value this. I notice the folks working on the other new explain parameter (memory) opted to add a comment to the auto_explain source code to say it wasn't supported. Thanks again, Michael
Re: Detoasting optionally to make Explain-Analyze less misleading
Michael Christofides writes: > While looking to add support for SERIALIZE in an explain visualisation tool > I work on, I realised there isn't yet an equivalent auto_explain parameter > for SERIALIZE. I'm not sure if this is a deliberate omission (perhaps for a > similar reason planning time is not included in auto_explain?), but I > didn't see it mentioned above, so I thought best to ask in case not. I'm not sure there's a need for it. When a query runs under auto_explain, the output values will be sent to the client, so those cycles should be accounted for anyway, no? (Perhaps the auto_explain documentation should mention this?) regards, tom lane
Re: Detoasting optionally to make Explain-Analyze less misleading
> > So thanks again! and this will really help a lot of people. I'd like to echo this thanks to you all. While looking to add support for SERIALIZE in an explain visualisation tool I work on, I realised there isn't yet an equivalent auto_explain parameter for SERIALIZE. I'm not sure if this is a deliberate omission (perhaps for a similar reason planning time is not included in auto_explain?), but I didn't see it mentioned above, so I thought best to ask in case not. Thanks again, Michael
Re: Detoasting optionally to make Explain-Analyze less misleading
First of all thanks for bringing this Feature to PostgreSQL. From a regular-user perspective (not everyone is a Pro) it is very misleading that ANALYZE doesn't do what it suggests it does. To run the query into some kind of /dev/null type of destination is feasible and that is what people end up doing after they have fallen into the "de-toasting" trap. Having SERIALIZE is a great improvement for certain. When I said that SERIALIZE should be the default, then this came mostly out of very surprising subjective experiences in the past. Turning it on certainly alters some existing benchmarks and timings. That is destructive in a way and would destroy some existing work and measures. I lack the overall understanding of the consequences, so please don't follow this (emotional) advice. So thanks again! and this will really help a lot of people. The people actually bothering with EXPLAIN options are likely to explore the documentation and now have a hint about this pitfall. The EXPLAIN part of PostgreSQL "feels" a lot better now. I appreciate all of your work on this issue, which came up without being on some kind of plan and of course for the overall work on PostgreSQL. /Stepan On 4/10/24 15:57, Tom Lane wrote: Matthias van de Meent writes: Upthread at [0], Stepan mentioned that we should default to SERIALIZE when ANALYZE is enabled. I suspect a patch in that direction would primarily contain updates in the test plan outputs, but I've not yet worked on that. Does anyone else have a strong opinion for or against adding SERIALIZE to the default set of explain features enabled with ANALYZE? I am 100% dead set against that, because it would silently render EXPLAIN outputs from different versions quite non-comparable. regards, tom lane
Re: Detoasting optionally to make Explain-Analyze less misleading
Matthias van de Meent writes: > Upthread at [0], Stepan mentioned that we should default to SERIALIZE > when ANALYZE is enabled. I suspect a patch in that direction would > primarily contain updates in the test plan outputs, but I've not yet > worked on that. > Does anyone else have a strong opinion for or against adding SERIALIZE > to the default set of explain features enabled with ANALYZE? I am 100% dead set against that, because it would silently render EXPLAIN outputs from different versions quite non-comparable. regards, tom lane
Re: Detoasting optionally to make Explain-Analyze less misleading
On Wed, 3 Apr 2024 at 23:50, Tom Lane wrote: > I've pushed this after a good deal of cosmetic polishing -- for > example, I spent some effort on making serializeAnalyzeReceive > look as much like printtup as possible, in hopes of making it > easier to keep the two functions in sync in future. Upthread at [0], Stepan mentioned that we should default to SERIALIZE when ANALYZE is enabled. I suspect a patch in that direction would primarily contain updates in the test plan outputs, but I've not yet worked on that. Does anyone else have a strong opinion for or against adding SERIALIZE to the default set of explain features enabled with ANALYZE? I'll add this to "Decisions to Recheck Mid-Beta"-section if nobody has a clear objection. Kind regards, Matthias van de Meent Neon (https://neon.tech) [0] https://postgr.es/m/ea885631-21f1-425a-97ed-c4bfb8cf9c63%40gmx.de
Re: Detoasting optionally to make Explain-Analyze less misleading
On Wed, 3 Apr 2024 at 23:50, Tom Lane wrote: > > Matthias van de Meent writes: >> On Tue, 2 Apr 2024 at 17:47, Tom Lane wrote: >>> IIUC, it's not possible to use the SERIALIZE option when explaining >>> CREATE TABLE AS, because you can't install the instrumentation tuple >>> receiver when the IntoRel one is needed. I think that's fine because >>> no serialization would happen in that case anyway, but should we >>> throw an error if that combination is requested? Blindly reporting >>> that zero bytes were serialized seems like it'd confuse people. > >> I think it's easily explained as no rows were transfered to the >> client. If there is actual confusion, we can document it, but >> confusing disk with network is not a case I'd protect against. See >> also: EXPLAIN (ANALYZE, SERIALIZE) INSERT without the RETURNING >> clause. > > Fair enough. There were a couple of spots in the code where I thought > this was important to comment about, though. Yeah, I'll agree with that. >>> However, isn't the bottom half of serializeAnalyzeStartup doing >>> exactly what the comment above it says we don't do, that is accounting >>> for the RowDescription message? Frankly I agree with the comment that >>> it's not worth troubling over, so I'd just drop that code rather than >>> finding a solution for the magic-number problem. > >> I'm not sure I agree with not including the size of RowDescription >> itself though, as wide results can have a very large RowDescription >> overhead; up to several times the returned data in cases where few >> rows are returned. > > Meh --- if we're rounding off to kilobytes, you're unlikely to see it. > In any case, if we start counting overhead messages, where shall we > stop? Should we count the eventual CommandComplete or ReadyForQuery, > for instance? I'm content to say that this measures data only; that > seems to jibe with other aspects of EXPLAIN's behavior. Fine with me. > > Removed. I've instead added buffer usage, as I realised that wasn't > > covered yet, and quite important to detect excessive detoasting (it's > > not covered at the top-level scan). > > Duh, good catch. > > I've pushed this after a good deal of cosmetic polishing -- for > example, I spent some effort on making serializeAnalyzeReceive > look as much like printtup as possible, in hopes of making it > easier to keep the two functions in sync in future. Thanks for the review, and for pushing! -Matthias
Re: Detoasting optionally to make Explain-Analyze less misleading
Matthias van de Meent writes: > On Tue, 2 Apr 2024 at 17:47, Tom Lane wrote: >> IIUC, it's not possible to use the SERIALIZE option when explaining >> CREATE TABLE AS, because you can't install the instrumentation tuple >> receiver when the IntoRel one is needed. I think that's fine because >> no serialization would happen in that case anyway, but should we >> throw an error if that combination is requested? Blindly reporting >> that zero bytes were serialized seems like it'd confuse people. > I think it's easily explained as no rows were transfered to the > client. If there is actual confusion, we can document it, but > confusing disk with network is not a case I'd protect against. See > also: EXPLAIN (ANALYZE, SERIALIZE) INSERT without the RETURNING > clause. Fair enough. There were a couple of spots in the code where I thought this was important to comment about, though. >> However, isn't the bottom half of serializeAnalyzeStartup doing >> exactly what the comment above it says we don't do, that is accounting >> for the RowDescription message? Frankly I agree with the comment that >> it's not worth troubling over, so I'd just drop that code rather than >> finding a solution for the magic-number problem. > I'm not sure I agree with not including the size of RowDescription > itself though, as wide results can have a very large RowDescription > overhead; up to several times the returned data in cases where few > rows are returned. Meh --- if we're rounding off to kilobytes, you're unlikely to see it. In any case, if we start counting overhead messages, where shall we stop? Should we count the eventual CommandComplete or ReadyForQuery, for instance? I'm content to say that this measures data only; that seems to jibe with other aspects of EXPLAIN's behavior. > Removed. I've instead added buffer usage, as I realised that wasn't > covered yet, and quite important to detect excessive detoasting (it's > not covered at the top-level scan). Duh, good catch. I've pushed this after a good deal of cosmetic polishing -- for example, I spent some effort on making serializeAnalyzeReceive look as much like printtup as possible, in hopes of making it easier to keep the two functions in sync in future. regards, tom lane
Re: Detoasting optionally to make Explain-Analyze less misleading
On Tue, 2 Apr 2024 at 17:47, Tom Lane wrote: > > Matthias van de Meent writes: > > Attached is v9, which is rebased on master to handle 24eebc65's > > changed signature of pq_sendcountedtext. > > It now also includes autocompletion, and a second patch which adds > > documentation to give users insights into this new addition to > > EXPLAIN. > > I took a quick look through this. Some comments in no particular > order: Thanks! > Documentation is not optional, so I don't really see the point of > splitting this into two patches. I've seen the inverse several times, but I've merged them in the attached version 10. > IIUC, it's not possible to use the SERIALIZE option when explaining > CREATE TABLE AS, because you can't install the instrumentation tuple > receiver when the IntoRel one is needed. I think that's fine because > no serialization would happen in that case anyway, but should we > throw an error if that combination is requested? Blindly reporting > that zero bytes were serialized seems like it'd confuse people. I think it's easily explained as no rows were transfered to the client. If there is actual confusion, we can document it, but confusing disk with network is not a case I'd protect against. See also: EXPLAIN (ANALYZE, SERIALIZE) INSERT without the RETURNING clause. > I'd lose the stuff about measuring memory consumption. Nobody asked > for that and the total is completely misleading, because in reality > we'll reclaim the memory used after each row. It would allow cutting > the text-mode output down to one line, too, instead of having your > own format that's not like anything else. Done. > I thought the upthread agreement was to display the amount of > data sent rounded to kilobytes, so why is the code displaying > an exact byte count? Probably it was because the other explain code I referenced was using bytes in the json/yaml format. Fixed. > I don't especially care for magic numbers like these: > > + /* see printtup.h why we add 18 bytes here. These are the > infos > +* needed for each attribute plus the attribute's name */ > + receiver->metrics.bytesSent += (int64) namelen + 1 + 18; > > If the protocol is ever changed in a way that invalidates this, > there's about 0 chance that somebody would remember to touch > this code. > However, isn't the bottom half of serializeAnalyzeStartup doing > exactly what the comment above it says we don't do, that is accounting > for the RowDescription message? Frankly I agree with the comment that > it's not worth troubling over, so I'd just drop that code rather than > finding a solution for the magic-number problem. In the comment above I intended to explain that it takes negligible time to serialize the RowDescription message (when compared to all other tasks of explain), so skipping the actual writing of the message would be fine. I'm not sure I agree with not including the size of RowDescription itself though, as wide results can have a very large RowDescription overhead; up to several times the returned data in cases where few rows are returned. Either way, I've removed that part of the code. > Don't bother with duplicating valgrind-related logic in > serializeAnalyzeReceive --- that's irrelevant to actual users. Removed. I've instead added buffer usage, as I realised that wasn't covered yet, and quite important to detect excessive detoasting (it's not covered at the top-level scan). > This seems like cowboy coding: > > + self->destRecevier.mydest = DestNone; > > You should define a new value of the CommandDest enum and > integrate this receiver type into the support functions > in dest.c. Done. > BTW, "destRecevier" is misspelled... Thanks, fixed. Kind regards, Matthias van de Meent. v10-0001-Explain-Add-SERIALIZE-option.patch Description: Binary data
Re: Detoasting optionally to make Explain-Analyze less misleading
Matthias van de Meent writes: > Attached is v9, which is rebased on master to handle 24eebc65's > changed signature of pq_sendcountedtext. > It now also includes autocompletion, and a second patch which adds > documentation to give users insights into this new addition to > EXPLAIN. I took a quick look through this. Some comments in no particular order: Documentation is not optional, so I don't really see the point of splitting this into two patches. IIUC, it's not possible to use the SERIALIZE option when explaining CREATE TABLE AS, because you can't install the instrumentation tuple receiver when the IntoRel one is needed. I think that's fine because no serialization would happen in that case anyway, but should we throw an error if that combination is requested? Blindly reporting that zero bytes were serialized seems like it'd confuse people. I'd lose the stuff about measuring memory consumption. Nobody asked for that and the total is completely misleading, because in reality we'll reclaim the memory used after each row. It would allow cutting the text-mode output down to one line, too, instead of having your own format that's not like anything else. I thought the upthread agreement was to display the amount of data sent rounded to kilobytes, so why is the code displaying an exact byte count? I don't especially care for magic numbers like these: + /* see printtup.h why we add 18 bytes here. These are the infos +* needed for each attribute plus the attribute's name */ + receiver->metrics.bytesSent += (int64) namelen + 1 + 18; If the protocol is ever changed in a way that invalidates this, there's about 0 chance that somebody would remember to touch this code. However, isn't the bottom half of serializeAnalyzeStartup doing exactly what the comment above it says we don't do, that is accounting for the RowDescription message? Frankly I agree with the comment that it's not worth troubling over, so I'd just drop that code rather than finding a solution for the magic-number problem. Don't bother with duplicating valgrind-related logic in serializeAnalyzeReceive --- that's irrelevant to actual users. This seems like cowboy coding: + self->destRecevier.mydest = DestNone; You should define a new value of the CommandDest enum and integrate this receiver type into the support functions in dest.c. BTW, "destRecevier" is misspelled... regards, tom lane
Re: Detoasting optionally to make Explain-Analyze less misleading
On Mon, 26 Feb 2024 at 21:54, stepan rutz wrote: > > Hi Matthias, thanks for picking it up. I still believe this is valuable > to a lot of people out there. Thanks for dealing with my proposal. > Matthias, Tom, Tomas everyone. > > Two (more or less) controversial remarks from side. > > 1. Actually serialization should be the default for "analyze" in > explain, as current analyze doesn't detoast and thus distorts the result > in extreme (but common) cases easily by many order of magnitude (see my > original video on that one). So current "explain analyze" only works for > some queries and since detoasting is really transparent, it is quite > something to leave detoasting out of explain analyze. This surprises > people all the time, since explain analyze suggests it "runs" the query > more realistically. I'm not sure about this, but it could easily be a mid-beta decision (if this is introduced before the feature freeze of 17, whenever that is). > 2. The bandwidth I computed in one of the previous versions of the patch > was certainly cluttering up the explain output and it is misleading yes, > but then again it performs a calculation people will now do in their > head. The "bandwidth" here is how much data your query gets out of > backend by means of the query and the deserialization. So of course if > you do id-lookups you get a single row and such querries do have a lower > data-retrieval bandwidth compared to bulk querries. I think that's a job for post-processing the EXPLAIN output by the user. If we don't give users the raw data, they won't be able to do their own cross-referenced processing: "5MB/sec" doesn't tell you how much time or data was actually spent. > However having some > measure of how fast data is delivered from the backend especially on > larger joins is still a good indicator of one aspect of a query. I'm not sure about that. Network speed is a big limiting factor that we can't measure here, and the size on disk is often going to be smaller than the data size when transfered across the network. > Sorry for the remarks. Both are not really important, just restating my > points here. I understand the objections and reasons that speak against > both points and believe the current scope is just right. No problem. Remarks from users (when built on solid arguments) are always welcome, even if we may not always agree on the specifics. -->8-- Attached is v9, which is rebased on master to handle 24eebc65's changed signature of pq_sendcountedtext. It now also includes autocompletion, and a second patch which adds documentation to give users insights into this new addition to EXPLAIN. Kind regards, Matthias van de Meent v9-0002-Add-EXPLAIN-SERIALIZE-docs.patch Description: Binary data v9-0001-Explain-Add-SERIALIZE-option.patch Description: Binary data
Re: Detoasting optionally to make Explain-Analyze less misleading
Hi Matthias, thanks for picking it up. I still believe this is valuable to a lot of people out there. Thanks for dealing with my proposal. Matthias, Tom, Tomas everyone. Two (more or less) controversial remarks from side. 1. Actually serialization should be the default for "analyze" in explain, as current analyze doesn't detoast and thus distorts the result in extreme (but common) cases easily by many order of magnitude (see my original video on that one). So current "explain analyze" only works for some queries and since detoasting is really transparent, it is quite something to leave detoasting out of explain analyze. This surprises people all the time, since explain analyze suggests it "runs" the query more realistically. 2. The bandwidth I computed in one of the previous versions of the patch was certainly cluttering up the explain output and it is misleading yes, but then again it performs a calculation people will now do in their head. The "bandwidth" here is how much data your query gets out of backend by means of the query and the deserialization. So of course if you do id-lookups you get a single row and such querries do have a lower data-retrieval bandwidth compared to bulk querries. However having some measure of how fast data is delivered from the backend especially on larger joins is still a good indicator of one aspect of a query. Sorry for the remarks. Both are not really important, just restating my points here. I understand the objections and reasons that speak against both points and believe the current scope is just right. /Stepan On 26.02.24 20:30, Matthias van de Meent wrote: Hi, I've taken the liberty to update this patch, and register it in the commitfest app to not lose track of progress [0]. The attached v8 patch measures scratch memory allocations (with MEMORY option), total time spent in serialization (with TIMING on, measures are inclusive of unseparated memcpy to the message buffer), and a count of produced bytes plus the output format used (text or binary). It's a light rework of the earlier 0007 patch, I've reused tests and some infrastructure, while the implementation details and comments have been updated significantly. I think we can bikeshed on format and names, but overall I think the patch is in a very decent shape. Stepan, thank you for your earlier work, and feel free to check it out or pick it up again if you want to; else I'll try to get this done. Kind regards, Matthias van de Meent [0] https://commitfest.postgresql.org/47/4852/
Re: Detoasting optionally to make Explain-Analyze less misleading
Hi, I've taken the liberty to update this patch, and register it in the commitfest app to not lose track of progress [0]. The attached v8 patch measures scratch memory allocations (with MEMORY option), total time spent in serialization (with TIMING on, measures are inclusive of unseparated memcpy to the message buffer), and a count of produced bytes plus the output format used (text or binary). It's a light rework of the earlier 0007 patch, I've reused tests and some infrastructure, while the implementation details and comments have been updated significantly. I think we can bikeshed on format and names, but overall I think the patch is in a very decent shape. Stepan, thank you for your earlier work, and feel free to check it out or pick it up again if you want to; else I'll try to get this done. Kind regards, Matthias van de Meent [0] https://commitfest.postgresql.org/47/4852/ v8-0001-Explain-Add-SERIALIZE-option.patch Description: Binary data
Re: Detoasting optionally to make Explain-Analyze less misleading
On 11/2/23 22:33, Matthias van de Meent wrote: > On Thu, 2 Nov 2023 at 22:25, Tomas Vondra > wrote: >> >> >> >> On 11/2/23 21:02, Matthias van de Meent wrote: >>> On Thu, 2 Nov 2023 at 20:32, Tomas Vondra >>> wrote: On 11/2/23 20:09, stepan rutz wrote: > db1=# explain (analyze, serialize) select * from test; > QUERY PLAN > --- > Seq Scan on test (cost=0.00..22.00 rows=1200 width=40) (actual > time=0.023..0.027 rows=1 loops=1) > Planning Time: 0.077 ms > Execution Time: 303.281 ms > Serialized Bytes: 7953 Bytes. Mode Text. Bandwidth 248.068 MB/sec [...] BTW if you really want to print amount of memory, maybe print it in kilobytes, like every other place in explain.c? >>> >>> Isn't node width in bytes, or is it an opaque value not to be >>> interpreted by users? I've never really investigated that part of >>> Postgres' explain output... >>> >> >> Right, "width=" is always in bytes. But fields like amount of sorted >> data is in kB, and this seems closer to that. >> Also, explain generally prints stuff in "key: value" style (in text format). >>> >>> That'd be key: metrickey=metricvalue for expanded values like those in >>> plan nodes and the buffer usage, no? >>> >> >> Possibly. But the proposed output does neither. Also, it starts with >> "Serialized Bytes" but then prints info about bandwidth. >> >> > Serialized Bytes: 7953 Bytes. Mode Text. Bandwidth 248.068 MB/sec >>> >>> I was thinking more along the lines of something like this: >>> >>> [...] >>> Execution Time: xxx ms >>> Serialization: time=yyy.yyy (in ms) size=yyy (in KiB, or B) mode=text >>> (or binary) This is significantly different from your output, as it doesn't hide >>> the measured time behind a lossy calculation of bandwidth, but gives >>> the measured data to the user; allowing them to derive their own >>> precise bandwidth if they're so inclined. >>> >> >> Might work. I'm still not convinced we need to include the mode, or that >> the size is that interesting/useful, though. > > I'd say size is interesting for systems where network bandwidth is > constrained, but CPU isn't. We currently only show estimated widths & > accurate number of tuples returned, but that's not an accurate > explanation of why your 30-row 3GB resultset took 1h to transmit on a > 10mbit line - that is only explained by the bandwidth of your > connection and the size of the dataset. As we can measure the size of > the returned serialized dataset here, I think it's in the interest of > any debugability to also present it to the user. Sadly, we don't have > good measures of bandwidth without sending that data across, so that's > the only metric that we can't show here, but total query data size is > definitely something that I'd be interested in here. Yeah, I agree with that. regards -- Tomas Vondra EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: Detoasting optionally to make Explain-Analyze less misleading
On Thu, 2 Nov 2023 at 22:25, Tomas Vondra wrote: > > > > On 11/2/23 21:02, Matthias van de Meent wrote: > > On Thu, 2 Nov 2023 at 20:32, Tomas Vondra > > wrote: > >> On 11/2/23 20:09, stepan rutz wrote: > >>> db1=# explain (analyze, serialize) select * from test; > >>> QUERY PLAN > >>> --- > >>> Seq Scan on test (cost=0.00..22.00 rows=1200 width=40) (actual > >>> time=0.023..0.027 rows=1 loops=1) > >>> Planning Time: 0.077 ms > >>> Execution Time: 303.281 ms > >>> Serialized Bytes: 7953 Bytes. Mode Text. Bandwidth 248.068 MB/sec > >> [...] > >> BTW if you really want to print amount of memory, maybe print it in > >> kilobytes, like every other place in explain.c? > > > > Isn't node width in bytes, or is it an opaque value not to be > > interpreted by users? I've never really investigated that part of > > Postgres' explain output... > > > > Right, "width=" is always in bytes. But fields like amount of sorted > data is in kB, and this seems closer to that. > > >> Also, explain generally > >> prints stuff in "key: value" style (in text format). > > > > That'd be key: metrickey=metricvalue for expanded values like those in > > plan nodes and the buffer usage, no? > > > > Possibly. But the proposed output does neither. Also, it starts with > "Serialized Bytes" but then prints info about bandwidth. > > > >>> Serialized Bytes: 7953 Bytes. Mode Text. Bandwidth 248.068 MB/sec > > > > I was thinking more along the lines of something like this: > > > > [...] > > Execution Time: xxx ms > > Serialization: time=yyy.yyy (in ms) size=yyy (in KiB, or B) mode=text > > (or binary) > > > This is significantly different from your output, as it doesn't hide > > the measured time behind a lossy calculation of bandwidth, but gives > > the measured data to the user; allowing them to derive their own > > precise bandwidth if they're so inclined. > > > > Might work. I'm still not convinced we need to include the mode, or that > the size is that interesting/useful, though. I'd say size is interesting for systems where network bandwidth is constrained, but CPU isn't. We currently only show estimated widths & accurate number of tuples returned, but that's not an accurate explanation of why your 30-row 3GB resultset took 1h to transmit on a 10mbit line - that is only explained by the bandwidth of your connection and the size of the dataset. As we can measure the size of the returned serialized dataset here, I think it's in the interest of any debugability to also present it to the user. Sadly, we don't have good measures of bandwidth without sending that data across, so that's the only metric that we can't show here, but total query data size is definitely something that I'd be interested in here. Kind regards, Matthias van de Meent Neon (https://neon.tech)
Re: Detoasting optionally to make Explain-Analyze less misleading
On 11/2/23 21:02, Matthias van de Meent wrote: > On Thu, 2 Nov 2023 at 20:32, Tomas Vondra > wrote: >> On 11/2/23 20:09, stepan rutz wrote: >>> db1=# explain (analyze, serialize) select * from test; >>> QUERY PLAN >>> --- >>> Seq Scan on test (cost=0.00..22.00 rows=1200 width=40) (actual >>> time=0.023..0.027 rows=1 loops=1) >>> Planning Time: 0.077 ms >>> Execution Time: 303.281 ms >>> Serialized Bytes: 7953 Bytes. Mode Text. Bandwidth 248.068 MB/sec >> [...] >> BTW if you really want to print amount of memory, maybe print it in >> kilobytes, like every other place in explain.c? > > Isn't node width in bytes, or is it an opaque value not to be > interpreted by users? I've never really investigated that part of > Postgres' explain output... > Right, "width=" is always in bytes. But fields like amount of sorted data is in kB, and this seems closer to that. >> Also, explain generally >> prints stuff in "key: value" style (in text format). > > That'd be key: metrickey=metricvalue for expanded values like those in > plan nodes and the buffer usage, no? > Possibly. But the proposed output does neither. Also, it starts with "Serialized Bytes" but then prints info about bandwidth. >>> Serialized Bytes: 7953 Bytes. Mode Text. Bandwidth 248.068 MB/sec > > I was thinking more along the lines of something like this: > > [...] > Execution Time: xxx ms > Serialization: time=yyy.yyy (in ms) size=yyy (in KiB, or B) mode=text > (or binary) > > This is significantly different from your output, as it doesn't hide > the measured time behind a lossy calculation of bandwidth, but gives > the measured data to the user; allowing them to derive their own > precise bandwidth if they're so inclined. > Might work. I'm still not convinced we need to include the mode, or that the size is that interesting/useful, though. regards -- Tomas Vondra EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: Detoasting optionally to make Explain-Analyze less misleading
On Thu, 2 Nov 2023 at 20:32, Tomas Vondra wrote: > On 11/2/23 20:09, stepan rutz wrote: > > db1=# explain (analyze, serialize) select * from test; > > QUERY PLAN > > --- > > Seq Scan on test (cost=0.00..22.00 rows=1200 width=40) (actual > > time=0.023..0.027 rows=1 loops=1) > > Planning Time: 0.077 ms > > Execution Time: 303.281 ms > > Serialized Bytes: 7953 Bytes. Mode Text. Bandwidth 248.068 MB/sec > [...] > BTW if you really want to print amount of memory, maybe print it in > kilobytes, like every other place in explain.c? Isn't node width in bytes, or is it an opaque value not to be interpreted by users? I've never really investigated that part of Postgres' explain output... > Also, explain generally > prints stuff in "key: value" style (in text format). That'd be key: metrickey=metricvalue for expanded values like those in plan nodes and the buffer usage, no? > > Serialized Bytes: 7953 Bytes. Mode Text. Bandwidth 248.068 MB/sec I was thinking more along the lines of something like this: [...] Execution Time: xxx ms Serialization: time=yyy.yyy (in ms) size=yyy (in KiB, or B) mode=text (or binary) This is significantly different from your output, as it doesn't hide the measured time behind a lossy calculation of bandwidth, but gives the measured data to the user; allowing them to derive their own precise bandwidth if they're so inclined. Kind regards, Matthias van de Meent Neon (https://neon.tech)
Re: Detoasting optionally to make Explain-Analyze less misleading
Hi Thomas, indeed by doing less the code also becomes trivial and ExplainPropertyInteger can be used as a oneliner. My intention was to actually get the realistic payload-bytes from the wire-protocol counted by the serialization option. I am also adding the protocol bits and the length of the data that is generated by serialization output-functions. So it should (hopefully) be the real numbers. Attached is a new revision of the patch that prints kB (floor'ed by integer-division by 1024). Maybe that is also misleading and bytes would be nicer (though harder to read). The output is now as follows: db1=# explain (analyze, serialize) select * from test; QUERY PLAN --- Seq Scan on test (cost=0.00..22.00 rows=1200 width=40) (actual time=0.014..0.017 rows=1 loops=1) Planning Time: 0.245 ms Execution Time: 292.983 ms Serialized Bytes: 77039 kB (4 rows) Definately a lot nicer and controlled by ExplainPropertyInteger in terms of formatting. The main motivation was to actually get a correct feeling for the execution time. Actually counting the bytes gives an impression of what would go over the wire. Only the big numbers matter here of course. Regards, Stepan On 02.11.23 20:32, Tomas Vondra wrote: On 11/2/23 20:09, stepan rutz wrote: Hi Thomas, you are right of course. Thanks! I have attached a new version of the patch that supports the syntax like suggested. The previous patch was insonsistent in style indeed. explain (analyze, serialize) and explain (analyze, serialize binary) That doesn't make too much of a difference for most scenarios I am certain. However the the seralize option itself does. Mostly because it performs the detoasting and that was a trap for me in the past with just plain analyze. Eg this scenario really is not too far fetched in a world where people have large JSONB values. db1=# create table test(id bigint, val text); db1=# insert into test(val) select string_agg(s::text, ',') from (select generate_series(1, 10_000_000) as s) as a1; now we have a cell that has roughly 80Mb in it. A large detoasting that will happen in reallife but in explain(analyze). and then... db1=# explain (analyze) select * from test; QUERY PLAN --- Seq Scan on test (cost=0.00..22.00 rows=1200 width=40) (actual time=0.018..0.020 rows=1 loops=1) Planning Time: 0.085 ms Execution Time: 0.044 ms (3 rows) db1=# explain (analyze, serialize) select * from test; QUERY PLAN --- Seq Scan on test (cost=0.00..22.00 rows=1200 width=40) (actual time=0.023..0.027 rows=1 loops=1) Planning Time: 0.077 ms Execution Time: 303.281 ms Serialized Bytes: 7953 Bytes. Mode Text. Bandwidth 248.068 MB/sec (4 rows) db1=# So the explain(analyze) does not process the ~80 MB in 0.044ms in any way of course. Honestly, I see absolutely no point in printing this. I have little idea what to do with the "bytes". We have to transfer this over network, but surely there's other data not included in this sum, right? But the bandwidth seems pretty bogus/useless, as it's calculated from execution time, which includes everything else, not just serialization. So what does it say? It certainly does not include the network transfer. IMO we should either print nothing or just the bytes. I don't see the point in printing the mode, which is determined by the command. Actually I could print the serialized bytes using 1. grouping-separators (eg 78_888_953) or 2. in the way pg_size_pretty does it. If doing it the pg_size_pretty way I am uncertain if it would be ok to query the actual pg_size_pretty function via its (certainly frozen) oid of 3166 and do OidFunctionCall1(3166...) to invoke it. Otherwise I'd say it would be nice if the code from that function would be made available as a utility function for all c-code. Any suggestions on this topic? I'm rather skeptical about this proposal, mostly because it makes it harder to process the explain output in scripts etc. But more importantly, it's a completely separate matter from what this patch does, so if you want to pursue this, I suggest you start a separate thread. If you want to introduce separators, surely this is not the only place that should do it (e.g. why not to do that for "rows" or "cost" estimates)? BTW if you really want to print amount of memory, maybe print it in kilobytes, like every other place in explain.c? Also, explain generally prints stuff in "key: value" style (in text format). regards diff --git a/src/backend/commands/explain.c b/src/backend/commands/explain.c index f1d71bc54e..e5aec7ca58 100644 --- a/src/backend/c
Re: Detoasting optionally to make Explain-Analyze less misleading
On 11/2/23 20:09, stepan rutz wrote: > Hi Thomas, > > you are right of course. Thanks! > > I have attached a new version of the patch that supports the syntax like > suggested. The previous patch was insonsistent in style indeed. > > explain (analyze, serialize) > > and > > explain (analyze, serialize binary) > > That doesn't make too much of a difference for most scenarios I am > certain. However the the seralize option itself does. Mostly because it > performs the detoasting and that was a trap for me in the past with just > plain analyze. > > > Eg this scenario really is not too far fetched in a world where people > have large JSONB values. > > > db1=# create table test(id bigint, val text); > > db1=# insert into test(val) select string_agg(s::text, ',') from (select > generate_series(1, 10_000_000) as s) as a1; > > now we have a cell that has roughly 80Mb in it. A large detoasting that > will happen in reallife but in explain(analyze). > > and then... > > db1=# explain (analyze) select * from test; > QUERY PLAN > --- > Seq Scan on test (cost=0.00..22.00 rows=1200 width=40) (actual > time=0.018..0.020 rows=1 loops=1) > Planning Time: 0.085 ms > Execution Time: 0.044 ms > (3 rows) > > db1=# explain (analyze, serialize) select * from test; > QUERY PLAN > --- > Seq Scan on test (cost=0.00..22.00 rows=1200 width=40) (actual > time=0.023..0.027 rows=1 loops=1) > Planning Time: 0.077 ms > Execution Time: 303.281 ms > Serialized Bytes: 7953 Bytes. Mode Text. Bandwidth 248.068 MB/sec > (4 rows) > > db1=# > > So the explain(analyze) does not process the ~80 MB in 0.044ms in any > way of course. Honestly, I see absolutely no point in printing this. I have little idea what to do with the "bytes". We have to transfer this over network, but surely there's other data not included in this sum, right? But the bandwidth seems pretty bogus/useless, as it's calculated from execution time, which includes everything else, not just serialization. So what does it say? It certainly does not include the network transfer. IMO we should either print nothing or just the bytes. I don't see the point in printing the mode, which is determined by the command. > > Actually I could print the serialized bytes using 1. grouping-separators > (eg 78_888_953) or 2. in the way pg_size_pretty does it. > > If doing it the pg_size_pretty way I am uncertain if it would be ok to > query the actual pg_size_pretty function via its (certainly frozen) oid > of 3166 and do OidFunctionCall1(3166...) to invoke it. Otherwise I'd say > it would be nice if the code from that function would be made available > as a utility function for all c-code. Any suggestions on this topic? > I'm rather skeptical about this proposal, mostly because it makes it harder to process the explain output in scripts etc. But more importantly, it's a completely separate matter from what this patch does, so if you want to pursue this, I suggest you start a separate thread. If you want to introduce separators, surely this is not the only place that should do it (e.g. why not to do that for "rows" or "cost" estimates)? BTW if you really want to print amount of memory, maybe print it in kilobytes, like every other place in explain.c? Also, explain generally prints stuff in "key: value" style (in text format). regards -- Tomas Vondra EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: Detoasting optionally to make Explain-Analyze less misleading
Hi Thomas, you are right of course. Thanks! I have attached a new version of the patch that supports the syntax like suggested. The previous patch was insonsistent in style indeed. explain (analyze, serialize) and explain (analyze, serialize binary) That doesn't make too much of a difference for most scenarios I am certain. However the the seralize option itself does. Mostly because it performs the detoasting and that was a trap for me in the past with just plain analyze. Eg this scenario really is not too far fetched in a world where people have large JSONB values. db1=# create table test(id bigint, val text); db1=# insert into test(val) select string_agg(s::text, ',') from (select generate_series(1, 10_000_000) as s) as a1; now we have a cell that has roughly 80Mb in it. A large detoasting that will happen in reallife but in explain(analyze). and then... db1=# explain (analyze) select * from test; QUERY PLAN --- Seq Scan on test (cost=0.00..22.00 rows=1200 width=40) (actual time=0.018..0.020 rows=1 loops=1) Planning Time: 0.085 ms Execution Time: 0.044 ms (3 rows) db1=# explain (analyze, serialize) select * from test; QUERY PLAN --- Seq Scan on test (cost=0.00..22.00 rows=1200 width=40) (actual time=0.023..0.027 rows=1 loops=1) Planning Time: 0.077 ms Execution Time: 303.281 ms Serialized Bytes: 7953 Bytes. Mode Text. Bandwidth 248.068 MB/sec (4 rows) db1=# So the explain(analyze) does not process the ~80 MB in 0.044ms in any way of course. Actually I could print the serialized bytes using 1. grouping-separators (eg 78_888_953) or 2. in the way pg_size_pretty does it. If doing it the pg_size_pretty way I am uncertain if it would be ok to query the actual pg_size_pretty function via its (certainly frozen) oid of 3166 and do OidFunctionCall1(3166...) to invoke it. Otherwise I'd say it would be nice if the code from that function would be made available as a utility function for all c-code. Any suggestions on this topic? Regards, /Stepan On 02.11.23 18:49, Tomas Vondra wrote: Hi, On 9/15/23 22:09, stepan rutz wrote: Hi, please see a revised version yesterday's mail. The patch attached now provides the following: EXPLAIN(ANALYZE,SERIALIZE) and EXPLAIN(ANALYZE,SERIALIZEBINARY) I haven't looked at the patch in detail yet, but this option name looks a bit strange/inconsistent. Either it should be SERIALIZE_BINARY (to match other multi-word options), or maybe there should be just SERIALIZE with a parameter to determine text/binary (like FORMAT, for example). So we'd do either EXPLAIN (SERIALIZE) EXPLAIN (SERIALIZE TEXT) to get serialization to text (which I guess 99% of people will do), or EXPLAIN (SERIALIZE BINARY) to get binary. regards diff --git a/src/backend/commands/explain.c b/src/backend/commands/explain.c index f1d71bc54e..bb75ac0b73 100644 --- a/src/backend/commands/explain.c +++ b/src/backend/commands/explain.c @@ -26,6 +26,7 @@ #include "nodes/nodeFuncs.h" #include "parser/analyze.h" #include "parser/parsetree.h" +#include "mb/pg_wchar.h" #include "rewrite/rewriteHandler.h" #include "storage/bufmgr.h" #include "tcop/tcopprot.h" @@ -33,6 +34,7 @@ #include "utils/guc_tables.h" #include "utils/json.h" #include "utils/lsyscache.h" +#include "utils/numeric.h" #include "utils/rel.h" #include "utils/ruleutils.h" #include "utils/snapmgr.h" @@ -47,7 +49,6 @@ ExplainOneQuery_hook_type ExplainOneQuery_hook = NULL; /* Hook for plugins to get control in explain_get_index_name() */ explain_get_index_name_hook_type explain_get_index_name_hook = NULL; - /* OR-able flags for ExplainXMLTag() */ #define X_OPENING 0 #define X_CLOSING 1 @@ -154,7 +155,8 @@ static void ExplainJSONLineEnding(ExplainState *es); static void ExplainYAMLLineStarting(ExplainState *es); static void escape_yaml(StringInfo buf, const char *str); - +static DestReceiver *CreateSerializeDestReceiver(int16 format); +static uint64 GetNumberOfSerializedBytes(DestReceiver *dest); /* * ExplainQuery - @@ -192,6 +194,23 @@ ExplainQuery(ParseState *pstate, ExplainStmt *stmt, es->settings = defGetBoolean(opt); else if (strcmp(opt->defname, "generic_plan") == 0) es->generic = defGetBoolean(opt); + else if (strcmp(opt->defname, "serialize") == 0) + { + es->serialize = true; + /* optionally check for BINARY as argument */ + if (opt->arg) + { +char *p = defGetString(opt); +if (strcmp(p, "binary") == 0) + es->serialize = es->serializebinary = true; +else + ereport(ERROR, + (errcode(ERRCODE_INVALID_PARAMETER_VALUE), + errmsg("unrecognized value for EXPLAIN option \"%s\": \"%s\"", +opt->defname, p), + parser_errposition(pstate, op
Re: Detoasting optionally to make Explain-Analyze less misleading
Hi, On 9/15/23 22:09, stepan rutz wrote: > Hi, > > please see a revised version yesterday's mail. The patch attached now > provides the following: > > EXPLAIN(ANALYZE,SERIALIZE) > > and > > EXPLAIN(ANALYZE,SERIALIZEBINARY) > I haven't looked at the patch in detail yet, but this option name looks a bit strange/inconsistent. Either it should be SERIALIZE_BINARY (to match other multi-word options), or maybe there should be just SERIALIZE with a parameter to determine text/binary (like FORMAT, for example). So we'd do either EXPLAIN (SERIALIZE) EXPLAIN (SERIALIZE TEXT) to get serialization to text (which I guess 99% of people will do), or EXPLAIN (SERIALIZE BINARY) to get binary. regards -- Tomas Vondra EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: Detoasting optionally to make Explain-Analyze less misleading
Hi, please see a revised version yesterday's mail. The patch attached now provides the following: EXPLAIN(ANALYZE,SERIALIZE) and EXPLAIN(ANALYZE,SERIALIZEBINARY) and timing output. Both options perform the serialization during analyze and provide an additional output in the plan like this: template1=# explain (analyze,serialize) select * from t12 limit 1; QUERY PLAN --- ... Serialized Bytes: 36 bytes Execution Time: 0.035 ms (5 rows) or also this template1=# explain (analyze,serialize) select * from t1; QUERY PLAN - Seq Scan on t1 (cost=0.00..1.02 rows=2 width=19) (actual time=0.101..0.111 rows=5 loops=1) Planning Time: 0.850 ms Serialized Bytes: 85777978 bytes Execution Time: 354.284 ms (4 rows) Its tempting to divide Serialized-Bytes by Execution-Time to get an idea of the serialization bandwidth. This is /dev/null serialization though. The results are length-counted and then discarded. Since detoasting happens implicitly during serialization, the number of bytes becomes huge in this case and accounts for the detoasted lengths as well. I tried to get the number of bytes send for the protocol's messages and the attribute headers correctly. For the actual values I am quite sure I get the correct measures, as one can really tell by sending more values across. Null is 4 bytes on the wire interestingly. I didn't know that, but it makes sense, since its using the same prefix length-field as all values do. I have checked the JBDC driver and it uses binary and text formats depending on an attribute's type oid. So having the SERIALIZEBINARY option is not accurate, as in reality both formats can be occur for the same tuple. Please provide some feedback on the new patch and let me know if this makes sense. In general this kind of option for EXPLAIN is a good thing for sure. Greetings, Stepan On 14.09.23 21:27, stepan rutz wrote: Hi Tom, Hi Matthias, you are right of course. I have looked at the code from printtup.c and made a new version of the patch. Thanks for the MemoryContextReset hint too (@Matthias) This time is called EXPLAIN(ANALYZE,SERIALIZE) (hey, it also sounds nicer phonetically) If the option SERIALIZE is active, the output functions are called and they perform the detoasting, which I have even checked. So things are better this way, however I hardcoded the output option "Text" (format=0). In printtup.c there is an incoming array which applies Text (format=0) or Binary (format=1) for each column individually. I am not sure whether this is even needed. I left in the if-statement from printtup.c which calls the binary output method of a given type. The result of the output is ignored and apparently free'd because of the memory-context-reset at the end. Please also note, that I added a call to DestReceiver's rDestroy hook, which was missing from explain.c before altogether. Feedback is appreciated. /Stepan On 12.09.23 17:26, Tom Lane wrote: Matthias van de Meent writes: Hmm, maybe we should measure the overhead of serializing the tuples instead. The difference between your patch and "serializing the tuples, but not sending them" is that serializing also does the detoasting, but also includes any time spent in the serialization functions of the type. So an option "SERIALIZE" which measures all the time the server spent on the query (except the final step of sending the bytes to the client) would likely be more useful than "just" detoasting. +1, that was my immediate reaction to the proposal as well. Some output functions are far from cheap. Doing only the detoast part seems like it's still misleading. Do we need to go as far as offering both text-output and binary-output options? regards, tom lane diff --git a/src/backend/commands/explain.c b/src/backend/commands/explain.c index 8570b14f62..9090e0f114 100644 --- a/src/backend/commands/explain.c +++ b/src/backend/commands/explain.c @@ -26,6 +26,7 @@ #include "nodes/nodeFuncs.h" #include "parser/analyze.h" #include "parser/parsetree.h" +#include "mb/pg_wchar.h" #include "rewrite/rewriteHandler.h" #include "storage/bufmgr.h" #include "tcop/tcopprot.h" @@ -47,7 +48,6 @@ ExplainOneQuery_hook_type ExplainOneQuery_hook = NULL; /* Hook for plugins to get control in explain_get_index_name() */ explain_get_index_name_hook_type explain_get_index_name_hook = NULL; - /* OR-able flags for ExplainXMLTag() */ #define X_OPENING 0 #define X_CLOSING 1 @@ -154,7 +154,8 @@ static void ExplainJSONLineEnding(ExplainState *es); static void ExplainYAMLLineStarting(ExplainState *es); static void escape_yaml(StringInfo buf, const char *str); - +static DestReceiver *CreateSerializeDestReceiver(int16 format); +static uint64 GetNumbe
Re: Detoasting optionally to make Explain-Analyze less misleading
Hi Tom, Hi Matthias, you are right of course. I have looked at the code from printtup.c and made a new version of the patch. Thanks for the MemoryContextReset hint too (@Matthias) This time is called EXPLAIN(ANALYZE,SERIALIZE) (hey, it also sounds nicer phonetically) If the option SERIALIZE is active, the output functions are called and they perform the detoasting, which I have even checked. So things are better this way, however I hardcoded the output option "Text" (format=0). In printtup.c there is an incoming array which applies Text (format=0) or Binary (format=1) for each column individually. I am not sure whether this is even needed. I left in the if-statement from printtup.c which calls the binary output method of a given type. The result of the output is ignored and apparently free'd because of the memory-context-reset at the end. Please also note, that I added a call to DestReceiver's rDestroy hook, which was missing from explain.c before altogether. Feedback is appreciated. /Stepan On 12.09.23 17:26, Tom Lane wrote: Matthias van de Meent writes: Hmm, maybe we should measure the overhead of serializing the tuples instead. The difference between your patch and "serializing the tuples, but not sending them" is that serializing also does the detoasting, but also includes any time spent in the serialization functions of the type. So an option "SERIALIZE" which measures all the time the server spent on the query (except the final step of sending the bytes to the client) would likely be more useful than "just" detoasting. +1, that was my immediate reaction to the proposal as well. Some output functions are far from cheap. Doing only the detoast part seems like it's still misleading. Do we need to go as far as offering both text-output and binary-output options? regards, tom lane diff --git a/src/backend/commands/explain.c b/src/backend/commands/explain.c index 8570b14f62..dc52019a3e 100644 --- a/src/backend/commands/explain.c +++ b/src/backend/commands/explain.c @@ -47,7 +47,6 @@ ExplainOneQuery_hook_type ExplainOneQuery_hook = NULL; /* Hook for plugins to get control in explain_get_index_name() */ explain_get_index_name_hook_type explain_get_index_name_hook = NULL; - /* OR-able flags for ExplainXMLTag() */ #define X_OPENING 0 #define X_CLOSING 1 @@ -155,6 +154,8 @@ static void ExplainYAMLLineStarting(ExplainState *es); static void escape_yaml(StringInfo buf, const char *str); +static DestReceiver *CreateSerializeDestReceiver(void); + /* * ExplainQuery - @@ -192,6 +193,8 @@ ExplainQuery(ParseState *pstate, ExplainStmt *stmt, es->settings = defGetBoolean(opt); else if (strcmp(opt->defname, "generic_plan") == 0) es->generic = defGetBoolean(opt); + else if (strcmp(opt->defname, "serialize") == 0) + es->serialize = defGetBoolean(opt); else if (strcmp(opt->defname, "timing") == 0) { timing_set = true; @@ -244,6 +247,12 @@ ExplainQuery(ParseState *pstate, ExplainStmt *stmt, (errcode(ERRCODE_INVALID_PARAMETER_VALUE), errmsg("EXPLAIN option TIMING requires ANALYZE"))); + /* check that serialize is used with EXPLAIN ANALYZE */ + if (es->serialize && !es->analyze) + ereport(ERROR, +(errcode(ERRCODE_INVALID_PARAMETER_VALUE), + errmsg("EXPLAIN option SERIALIZE requires ANALYZE"))); + /* check that GENERIC_PLAN is not used with EXPLAIN ANALYZE */ if (es->generic && es->analyze) ereport(ERROR, @@ -565,9 +574,16 @@ ExplainOnePlan(PlannedStmt *plannedstmt, IntoClause *into, ExplainState *es, /* * Normally we discard the query's output, but if explaining CREATE TABLE * AS, we'd better use the appropriate tuple receiver. + * Also if we choose to serialize during explain, we need to send the + * tuples to a receiver that performs the serializing + * which includes detoasting and the conversion of the attributes + * into the protocol receiving format. The latter can also be costly + * for some types */ if (into) dest = CreateIntoRelDestReceiver(into); + else if (es->analyze && es->serialize) + dest = CreateSerializeDestReceiver(); else dest = None_Receiver; @@ -606,6 +622,9 @@ ExplainOnePlan(PlannedStmt *plannedstmt, IntoClause *into, ExplainState *es, /* run cleanup too */ ExecutorFinish(queryDesc); + /* call the dest receiver's destroy method even during explain */ + dest->rDestroy(dest); + /* We can't run ExecutorEnd 'till we're done printing the stats... */ totaltime += elapsed_time(&starttime); } @@ -5052,3 +5071,193 @@ escape_yaml(StringInfo buf, const char *str) { escape_json(buf, str); } + + +/* Serializing DestReceiver functions + * + * the idea is that running EXPLAIN (ANALYZE) sometimes gives results that + * are way unrealistic, since during explain there is no detoasting of + * values. That means you get no idea what your actual query time is going + * to be, nor what the actual overhead of detoasting is. + * + * Also the output/send functions o
Re: Detoasting optionally to make Explain-Analyze less misleading
Hi Stepan & all, On Tue, 12 Sep 2023 17:16:00 +0200 stepan rutz wrote: ... > Attached a new patch. Hoping for feedback, Nice addition to EXPLAIN! On the feature front, what about adding the actual detoasting/serializing time in the explain output? That could be: => explain (analyze,serialize,costs off,timing off) select * from test_detoast; QUERY PLAN ─ Seq Scan on public.test_detoast (actual rows=Nv loops=N) Planning Time: N ms Execution Time: N ms Serialize Time: N ms
Re: Detoasting optionally to make Explain-Analyze less misleading
Matthias van de Meent writes: > Hmm, maybe we should measure the overhead of serializing the tuples instead. > The difference between your patch and "serializing the tuples, but not > sending them" is that serializing also does the detoasting, but also > includes any time spent in the serialization functions of the type. So > an option "SERIALIZE" which measures all the time the server spent on > the query (except the final step of sending the bytes to the client) > would likely be more useful than "just" detoasting. +1, that was my immediate reaction to the proposal as well. Some output functions are far from cheap. Doing only the detoast part seems like it's still misleading. Do we need to go as far as offering both text-output and binary-output options? regards, tom lane
Re: Detoasting optionally to make Explain-Analyze less misleading
Hi Matthias, thanks for your feedback. I wasn't sure on the memory-contexts. I was actually also unsure on whether the array TupleTableSlot.tts_isnull is also set up correctly by the previous call to slot_getallattrs(slot). This would allow to get rid of even more code from this patch, which is in the loop and determines whether a field is null or not. I switched to using tts_isnull from TupleTableSlot now, seems to be ok and the docs say it is save to use. Also I reuse the MemoryContext throughout the livetime of the receiver. Not sure if that makes a difference. One more thing I noticed. During explain.c the DestReceiver's destroy callback was never invoked. I added a line to do that, however I am unsure whether this is the right place or a good idea in the first place. This potentially affects plain analyze calls as well, though seems to behave nicely. Even when I explain (analyze) select * into This is the call I am talking about, which was missing from explain.c dest->rDestroy(dest); Attached a new patch. Hoping for feedback, Greetings, Stepan On 12.09.23 14:25, Matthias van de Meent wrote: On Tue, 12 Sept 2023 at 12:56, stepan rutz wrote: Hi, I have fallen into this trap and others have too. If you run EXPLAIN(ANALYZE) no de-toasting happens. This makes query-runtimes differ a lot. The bigger point is that the average user expects more from EXPLAIN(ANALYZE) than what it provides. This can be suprising. You can force detoasting during explain with explicit calls to length(), but that is tedious. Those of us who are forced to work using java stacks, orms and still store mostly documents fall into this trap sooner or later. I have already received some good feedback on this one, so this is an issue that bother quite a few people out there. Yes, the lack of being able to see the impact of detoasting (amongst others) in EXPLAIN (ANALYZE) can hide performance issues. It would be great to get some feedback on the subject and how to address this, maybe in totally different ways. Hmm, maybe we should measure the overhead of serializing the tuples instead. The difference between your patch and "serializing the tuples, but not sending them" is that serializing also does the detoasting, but also includes any time spent in the serialization functions of the type. So an option "SERIALIZE" which measures all the time the server spent on the query (except the final step of sending the bytes to the client) would likely be more useful than "just" detoasting. 0001_explain_analyze_and_detoast.patch I notice that this patch creates and destroys a memory context for every tuple that the DestReceiver receives. I think that's quite wasteful, as you should be able to create only one memory context and reset it before (or after) each processed tuple. That also reduces the differences in measurements between EXPLAIN and normal query processing of the tuples - after all, we don't create new memory contexts for every tuple in the normal DestRemote receiver either, right? Kind regards, Matthias van de Meent diff --git a/src/backend/commands/explain.c b/src/backend/commands/explain.c index 8570b14f62..95a09dd4cb 100644 --- a/src/backend/commands/explain.c +++ b/src/backend/commands/explain.c @@ -47,7 +47,6 @@ ExplainOneQuery_hook_type ExplainOneQuery_hook = NULL; /* Hook for plugins to get control in explain_get_index_name() */ explain_get_index_name_hook_type explain_get_index_name_hook = NULL; - /* OR-able flags for ExplainXMLTag() */ #define X_OPENING 0 #define X_CLOSING 1 @@ -155,6 +154,8 @@ static void ExplainYAMLLineStarting(ExplainState *es); static void escape_yaml(StringInfo buf, const char *str); +static DestReceiver *CreateDetoastDestReceiver(void); + /* * ExplainQuery - @@ -192,6 +193,8 @@ ExplainQuery(ParseState *pstate, ExplainStmt *stmt, es->settings = defGetBoolean(opt); else if (strcmp(opt->defname, "generic_plan") == 0) es->generic = defGetBoolean(opt); + else if (strcmp(opt->defname, "detoast") == 0) + es->detoast = defGetBoolean(opt); else if (strcmp(opt->defname, "timing") == 0) { timing_set = true; @@ -244,6 +247,12 @@ ExplainQuery(ParseState *pstate, ExplainStmt *stmt, (errcode(ERRCODE_INVALID_PARAMETER_VALUE), errmsg("EXPLAIN option TIMING requires ANALYZE"))); + /* check that detoast is used with EXPLAIN ANALYZE */ + if (es->detoast && !es->analyze) + ereport(ERROR, +(errcode(ERRCODE_INVALID_PARAMETER_VALUE), + errmsg("EXPLAIN option DETOAST requires ANALYZE"))); + /* check that GENERIC_PLAN is not used with EXPLAIN ANALYZE */ if (es->generic && es->analyze) ereport(ERROR, @@ -565,9 +574,13 @@ ExplainOnePlan(PlannedStmt *plannedstmt, IntoClause *into, ExplainState *es, /* * Normally we discard the query's output, but if explaining CREATE TABLE * AS, we'd better use the appropriate tuple receiver. + * Also if we choose to detoast during explain, we need to send the + * tuples to a receiver that
Re: Detoasting optionally to make Explain-Analyze less misleading
On Tue, 12 Sept 2023 at 12:56, stepan rutz wrote: > > Hi, > > I have fallen into this trap and others have too. If you run > EXPLAIN(ANALYZE) no de-toasting happens. This makes query-runtimes > differ a lot. The bigger point is that the average user expects more > from EXPLAIN(ANALYZE) than what it provides. This can be suprising. You > can force detoasting during explain with explicit calls to length(), but > that is tedious. Those of us who are forced to work using java stacks, > orms and still store mostly documents fall into this trap sooner or > later. I have already received some good feedback on this one, so this > is an issue that bother quite a few people out there. Yes, the lack of being able to see the impact of detoasting (amongst others) in EXPLAIN (ANALYZE) can hide performance issues. > It would be great to get some feedback on the subject and how to address > this, maybe in totally different ways. Hmm, maybe we should measure the overhead of serializing the tuples instead. The difference between your patch and "serializing the tuples, but not sending them" is that serializing also does the detoasting, but also includes any time spent in the serialization functions of the type. So an option "SERIALIZE" which measures all the time the server spent on the query (except the final step of sending the bytes to the client) would likely be more useful than "just" detoasting. > 0001_explain_analyze_and_detoast.patch I notice that this patch creates and destroys a memory context for every tuple that the DestReceiver receives. I think that's quite wasteful, as you should be able to create only one memory context and reset it before (or after) each processed tuple. That also reduces the differences in measurements between EXPLAIN and normal query processing of the tuples - after all, we don't create new memory contexts for every tuple in the normal DestRemote receiver either, right? Kind regards, Matthias van de Meent
Detoasting optionally to make Explain-Analyze less misleading
Hi, I have fallen into this trap and others have too. If you run EXPLAIN(ANALYZE) no de-toasting happens. This makes query-runtimes differ a lot. The bigger point is that the average user expects more from EXPLAIN(ANALYZE) than what it provides. This can be suprising. You can force detoasting during explain with explicit calls to length(), but that is tedious. Those of us who are forced to work using java stacks, orms and still store mostly documents fall into this trap sooner or later. I have already received some good feedback on this one, so this is an issue that bother quite a few people out there. Attached is a patch for addressing the issue in form of adding another parameter to explain. I don't know if that is a good idea, but I got some feedback that a solution to this problem would be appreciated by some people out there. It would also be nice to reflect the detoasting in the "buffers" option of explain as well. The change for detoasting is only a few lines though. So the idea was to allow this EXPLAIN (ANALYZE, DETOAST) SELECT * FROM sometable; and perform the detoasting step additionally during the explain. This just gives a more realistic runtime and by playing around with the parameter and comparing the execution-times of the query one even gets an impression about the detoasting cost involved in a query. Since the parameter is purely optional, it would not affect any existing measures. It is not uncommon that the runtime of explain-analyze is way unrealistic in the real world, where people use PostgreSQL to store larger and larger documents inside tables and not using Large-Objects. Here is a video of the effect (in an exagerated form): https://www.stepanrutz.com/short_detoast_subtitles.mp4 It would be great to get some feedback on the subject and how to address this, maybe in totally different ways. Greetings from cologne, Stepan Stepan Rutz - IT Consultant, Cologne Germany, stepan.rutz AT gmx.de diff --git a/src/backend/commands/explain.c b/src/backend/commands/explain.c index 8570b14f62..50859e6a1e 100644 --- a/src/backend/commands/explain.c +++ b/src/backend/commands/explain.c @@ -47,6 +47,8 @@ ExplainOneQuery_hook_type ExplainOneQuery_hook = NULL; /* Hook for plugins to get control in explain_get_index_name() */ explain_get_index_name_hook_type explain_get_index_name_hook = NULL; +/* Forward decl for detoasting analyze */ +static const DestReceiver detoastAnalyzeDR; /* OR-able flags for ExplainXMLTag() */ #define X_OPENING 0 @@ -192,6 +194,8 @@ ExplainQuery(ParseState *pstate, ExplainStmt *stmt, es->settings = defGetBoolean(opt); else if (strcmp(opt->defname, "generic_plan") == 0) es->generic = defGetBoolean(opt); + else if (strcmp(opt->defname, "detoast") == 0) + es->detoast = defGetBoolean(opt); else if (strcmp(opt->defname, "timing") == 0) { timing_set = true; @@ -244,6 +248,12 @@ ExplainQuery(ParseState *pstate, ExplainStmt *stmt, (errcode(ERRCODE_INVALID_PARAMETER_VALUE), errmsg("EXPLAIN option TIMING requires ANALYZE"))); + /* check that detoast is used with EXPLAIN ANALYZE */ + if (es->detoast && !es->analyze) + ereport(ERROR, +(errcode(ERRCODE_INVALID_PARAMETER_VALUE), + errmsg("EXPLAIN option DETOAST requires ANALYZE"))); + /* check that GENERIC_PLAN is not used with EXPLAIN ANALYZE */ if (es->generic && es->analyze) ereport(ERROR, @@ -565,9 +575,13 @@ ExplainOnePlan(PlannedStmt *plannedstmt, IntoClause *into, ExplainState *es, /* * Normally we discard the query's output, but if explaining CREATE TABLE * AS, we'd better use the appropriate tuple receiver. + * Also if we choose to detoast during explain, we need to send the + * tuples to a receiver that performs the detoasting */ if (into) dest = CreateIntoRelDestReceiver(into); + else if (es->analyze && es->detoast) + dest = &detoastAnalyzeDR; else dest = None_Receiver; @@ -5052,3 +5066,98 @@ escape_yaml(StringInfo buf, const char *str) { escape_json(buf, str); } + + +/* + * detoasting DestReceiver functions + * + */ +/* the idea is that running EXPLAIN (ANALYZE) sometimes gives results that + * are way unrealistic, since during explain there is no detoasting of + * values. That means you get no idea what your actual query time is going + * to be, nor what the actual overhead of detoasting is. + * + * if you run EXPLAIN (ANAYLZE, DETOAST) then you get an impression of + * what is actually happening when you use postgresql as a document-database + * or have large values for some other reason. + */ + +/* this method receives the tuples/records during explain (analyze, detoast) + * and detoasts them. + * this method is otherwise not called and its purpose is to detoast to + * make analyze optionally more realistic + */ +static bool +detoastAnalyzeReceive(TupleTableSlot *slot, DestReceiver *self) +{ + MemoryContext cxt; + MemoryContext oldcxt; + int i; + bool shouldFree = true; + TupleDesc