Hi, On 2019-10-22 11:58:35 -0700, Maciek Sakrejda wrote: > I originally reported this in pgsql-bugs [0], but there wasn't much > feedback there, so moving the discussion here. When using JSON, YAML, or > XML-format EXPLAIN on a plan that uses a parallelized sort, the Sort nodes > list two different entries for "Workers", one for the sort-related info, > and one for general worker info. This is what this looks like in JSON (some > details elided): > > { > "Node Type": "Sort", > ... > "Workers": [ > { > "Worker Number": 0, > "Sort Method": "external merge", > "Sort Space Used": 20128, > "Sort Space Type": "Disk" > }, > { > "Worker Number": 1, > "Sort Method": "external merge", > "Sort Space Used": 20128, > "Sort Space Type": "Disk" > } > ], > ... > "Workers": [ > {
> This is technically valid JSON, but it's extremely difficult to work with, > since default JSON parsing in Ruby, node, Python, Go, and even Postgres' > own jsonb only keep the latest key It's also quite confusing. > As Tom Lane pointed out in my pgsql-bugs thread, this has been > reported before [1] and in that earlier thread, Andrew Dunstan suggested > that perhaps the simplest solution is to just rename the sort-related > Workers node. Thoughts? Yea, I think we should fix this. The current output basically makes no sense. > Tom expressed some concerns about a breaking change here, > though I think the current behavior means vanishingly few users are parsing > this data correctly. Well, in a lot of the cases there's no parallel output for the sort, and in other cases BUFFERS is not specified. In either case the 'duplicate key' problem won't exist then. While Tom said: On 2019-10-16 09:16:56 +0200, Tom Lane wrote: > I think the text-mode output is intentional, but the other formats > need more work. Sort Method: external merge Disk: 4920kB Worker 0: Sort Method: external merge Disk: 5880kB Worker 1: Sort Method: external merge Disk: 5920kB Buffers: shared hit=682 read=10188, temp read=1415 written=2101 Worker 0: actual time=130.058..130.324 rows=1324 loops=1 Buffers: shared hit=337 read=3489, temp read=505 written=739 Worker 1: actual time=130.273..130.512 rows=1297 loops=1 Buffers: shared hit=345 read=3507, temp read=505 written=744 I don't think this is close to being good enough to be worth preserving. I think it's worth avoiding unnecessary breakage of explain output, but we also shouldn't endlessly carry forward confusing output, just because of that. It clearly seems like it'd be better if this instead were Sort Method: external merge Disk: 4920kB Buffers: shared hit=682 read=10188, temp read=1415 written=2101 Worker 0: actual time=130.058..130.324 rows=1324 loops=1 Sort Method: external merge Disk: 5880kB Buffers: shared hit=337 read=3489, temp read=505 written=739 Worker 1: actual time=130.273..130.512 rows=1297 loops=1 Buffers: shared hit=345 read=3507, temp read=505 written=744 Sort Method: external merge Disk: 5920kB I think the way this information was added in bf11e7ee2e36 and 33001fd7a707, contrasting to the output added in b287df70e408, is just not right. If we add similar instrumentation reporting to more nodes, we'll end up with duplicated information all over. Additionally the per-worker part of show_sort_info() basically just duplicated the rest of the function. I then also did something similar (although luckily with a different key...), with the ExplainPrintJIT() call for Gather nodes. Unfortunately I think the fix isn't all that trivial, due to the way we output the per-worker information at the end of ExplainNode(), by just dumping things into a string. It seems to me that a step in the right direction would be for ExplainNode() to create planstate->worker_instrument StringInfos, which can be handed to routines like show_sort_info(), which would print the per-node information into that, rather than directly dumping into es->output. Most of the current "Show worker detail" would be done earlier in ExplainNode(), at the place where we current display the "actual rows" bit. ISTM that should include removing the duplication fo the the contents of show_sort_info(), and probably also for the Gather, GatherMerge blocks (I've apparently skipped adding the JIT information to the latter, not sure if we ought to fix that in the stable branches). Any chance you want to take a stab at that? I don't think we'll fix it soon, but damn, all this string appending around just isn't a great way to reliably build nested data formats. Greetings, Andres Freund