John Naylor <jcnay...@gmail.com> writes: > On 1/12/18, Alvaro Herrera <alvhe...@alvh.no-ip.org> wrote: >> Pushed 0001 with some changes of my own. I'm afraid I created a few >> conflicts for the later patches in your series; please rebase.
> Attached, rebased over 255f14183ac. I decided that I wasn't going to get answers about the things I cared about without looking through the patchset, so I've now done so, in a once-over-lightly fashion. Here's a quick summary of what it does, for those who may be equally confused, and then some comments (not really a review). The patch removes DATA and (SH)DESCR lines from all the catalog/pg_*.h files, as well as the #defines for OID-value macros, and puts that information into pg_*.dat files corresponding to the .h files, in a form that is easily readable and writable by Perl scripts. Comments associated with this info are also transferred to the .dat files, and the scripts that can rewrite the .dat files are able to preserve the comments. genbki.pl is able to generate postgres.bki and other initdb input files directly from the .dat files. It also creates a single header file src/include/catalog/oid_symbols.h that contains all of the OID-value macros that were formerly in the pg_*.h files. The patch gets rid of the need to write hard-wired OIDs when referencing operators, opfamilies, etc in the .dat files; now you can write their names instead. genbki.pl will look up the names and substitute numeric OIDs in the emitted postgres.bki file. There are also provisions to shorten the .dat files through the use of abbreviated field names, default values for fields, and some other less-general techniques. -- OK, now some comments: I'm not sure about the decision to move all the OID macros into one file; that seems like namespace pollution. It's especially odd that you did that but did not consolidate fmgroids.h with the macros for symbols from other catalogs. Now it's true that we need all those symbols to be distinct, since it needs to be possible for .c files to include any combination of pg_*.h headers, but I don't think it's an especially good idea that you have to include all of them or none. Even if we're willing to put up with that namespace pollution for backend code, there are clients that currently include pg_*.h headers to get some of those macros, and they're likely to be less happy about it. The design I'd kind of imagined was one generated file of #define's per pg_*.h file, not just one giant one. It would be really nice, also, if the attribute number macros (Natts_pg_proc, Anum_pg_proc_proname, etc) could be autogenerated. Manually renumbering those is one of the bigger pains in the rear when adding catalog columns. It was less of a pain than adjusting the DATA lines of course, so I never figured it was worth doing something about in isolation --- but with this infrastructure in place, that's manual work we shouldn't have to do anymore. Another thing that I'd sort of hoped might happen from this patchset is to cure the problem of keeping some catalog headers safe for client-side inclusion, because some clients want the OID value macros and/or macros for column values (eg PROVOLATILE_IMMUTABLE), so they currently have to #include those headers or else hard-code the values. We've worked around that to date with ad-hoc solutions like splitting function declarations out to pg_*_fn.h files, but I never liked that much. With the OID value macros being moved out to separate generated file(s), there's now a possibility that we could fix this once and for all by making client-side code include those file(s) not pg_type.h et al themselves. But we'd need a way to put the column-value macros into those files too; maybe that's too messy to make it practical. The .dat files need to have header comments that follow project conventions, in particular they need to contain copyright statements. Likewise for generated files. I've got zero faith that the .h files will hold still long enough for these patches to be reviewed and applied. The ones that touch significant amounts of data need to be explained as "run this script on the current data", rather than presented as static diffs. I'm not really thrilled by the single-purpose "magic" behaviors added in 0007, such as computing prosrc from proname. I think those will add more confusion than they're worth. In 0010, you relabel the types of some OID columns so that genbki.pl will know which lookup to apply to them. That's not such a problem for the relabelings that are just macros and genbki.pl converts back to type OID in the .bki file. But you also did things like s/Oid/regtype/, and that IS a problem because it will affect what client code sees in those catalog columns. We've discussed changing those columns to regfoo types in the past, and decided not to, because of the likelihood of breaking client queries. I do not think this patch gets to change that policy. So the way to identify the lookup rule needs to be independent of whether the column is declared as Oid or an Oid alias type. Perhaps an explicit marker telling what transformation to make, like Oid rngsubtype BKI_LOOKUP(pg_type); would work for that. I'm not really on board at all with 0012, which AFAICS moves the indexing and toast-table information out of indexing.h and toasting.h for no good reason whatever. We'll have quite enough code thrash and pending-patch breakage from this patch set; we don't need to take on rearrangements that aren't buying anything. regards, tom lane