On 2015-06-04 11:51:44 +0300, Heikki Linnakangas wrote:
> I think this explanation is wrong. I agree that there are many places that
> would be good to refactor - like StartupXLOG() - but the multixact code was
> not too bad in that regard. IIRC the patch included some refactoring, it
> added some new helper functions in heapam.c, for example. You can argue that
> it didn't do enough of it, but that was not the big issue.

Yea, but the bugs were more around the interactions with other parts of
the system. Like e.g. crash recovery, which now is about bug 7 or so. And
those are the ones that are hard to understand.

> The big issue was at the architecture level. Basically, we liked vacuuming
> of XIDs and clog so much that we decided that it'd be nice if you had to
> vacuum multixids too, in order to not lose data. Many of the bugs and issues
> were not new - we had multixids before - but we upped the ante and turned
> minor locking bugs into data loss. And that had nothing to do with the code
> structure - we'd have similar issues if we had rewritten everything in Java,
> with the same design.

I think we're probably just using slightly different terms here - for me
one very good way of fixing some structurally bad things *is* improving
the design.

If you look at the bugs around multixacts: The first few were around
ctid-chaining, hard to find and fix because there are about 8-10 places
implementing it with slight differences. The next bunch were around
vacuuming, some of them oversights, a good bunch of them more
fundamental. Crash recovery wasn't thought about (lack of
testing/review), and more generally the new code tripped over bad old
decisions (hey, wraparound is ok!). Then there were a bunch of stupid
bugs in crash-recovery (testing mainly), and larger scale bugs (hey,
let's access stuff during recovery). Then there's the whole row level
locking code - which is by now among the hardest to understand code in
postgres - and voila, it contained a bunch of oversights that were hard
to spot.

So yes, I think nicer code to work with would have prevented us from
making a significant portion of these. It might have also made us
realize earlier how significant the increase in complexity was.

> So, I'm all for refactoring and adding abstractions where it makes sense,
> but it's not going to solve design problems.

I personally don't really see the multixact changes being that bad on
the overall design. It pretty much just extended an earlier design. Now
that wasn't great, but I don't think too many people had realized that
at that point. The biggest problem was underestimating the complexity.

Greetings,

Andres Freund

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers