I wrote:
> Marco Atzeri <marco.atz...@gmail.com> writes:
>> Building on Cygwin latest 10 beta1 or head source,
>> make check fails as:
>> ...
>> performing post-bootstrap initialization ... 2017-05-31 23:23:22.214
>> CEST [16860] FATAL: collation "ja_JP" for encoding "EUC_JP" already exists

> Hmph. Could we see the results of "locale -a | grep ja_JP" ?

Despite the lack of followup from the OP, I'm pretty troubled by this report. It shows that the reimplementation of OS collation data import as pg_import_system_collations() is a whole lot more fragile than the original coding. We have never before trusted "locale -a" to not produce duplicate outputs, not since the very beginning in 414c5a2e. AFAICS, the current coding has also lost the protections we added very shortly after that in 853c1750f; and it has also lost the admittedly rather arbitrary, but at least deterministic, preference order for conflicting short aliases that was in the original initdb code.

I suppose the idea was to see whether we actually needed those defenses, but since we have here a failure report after less than a month of beta, it seems clear to me that we do. I think we need to upgrade pg_import_system_collations to have all the same logic that was there before.

Now the hard part of that is that because pg_import_system_collations isn't using a temporary staging table, but is just inserting directly into pg_collation, there isn't any way for it to eliminate duplicates unless it uses if_not_exists behavior all the time. So there seem to be two ways to proceed:

1. Drop pg_import_system_collations' if_not_exists argument and just define it as adding any collations not already known in pg_collation.

2. Significantly rewrite it so that it de-dups the collation set by hand before trying to insert into pg_collation.

#2 seems like a lot more work, but on the other hand, we might need most of that logic anyway to get back deterministic alias handling. However, since I cannot see any real-world use case at all for if_not_exists = false, I figure we might as well do #1 and take whatever simplification we can get that way.

I'm willing to do the legwork on this, but before I start, does anyone have any ideas or objections?

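For anyone who wants to see what "de-dup by hand with a deterministic alias preference" could look like, here is a rough sketch in Python rather than the backend's C. It is only an illustration: the function names, the (locale name, encoding) input format, and the tie-break rule (full names first, then the alphabetically first source for each short alias) are assumptions of mine for this example, and it does not reproduce the actual preference order from the original initdb code.

```python
def strip_codeset(name):
    """Drop the ".codeset" part of a locale name, keeping any "@modifier".

    Hypothetical helper for this sketch: "ja_JP.eucjp" -> "ja_JP",
    "ca_ES.utf8@valencia" -> "ca_ES@valencia".
    """
    head, dot, rest = name.partition(".")
    if not dot:
        return name
    _, at, modifier = rest.partition("@")
    return head + at + modifier


def dedup_collations(locales):
    """Return at most one entry per (collation name, encoding).

    locales: iterable of (locale_name, encoding) pairs, e.g. as parsed
    from "locale -a" output (the parsing itself is not shown here).
    Result: sorted list of (collation_name, encoding, source_locale).
    """
    unique = sorted(set(locales))  # exact duplicates collapse here
    chosen = {}

    # Pass 1: every locale is entered under its own full name.
    for name, enc in unique:
        chosen.setdefault((name, enc), name)

    # Pass 2: short aliases. A full name from pass 1 always wins; among
    # competing aliases, the alphabetically first source locale wins
    # (an arbitrary but deterministic rule, assumed for this sketch).
    for name, enc in unique:
        alias = strip_codeset(name)
        if alias != name:
            chosen.setdefault((alias, enc), name)

    return sorted((coll, enc, src) for (coll, enc), src in chosen.items())


if __name__ == "__main__":
    sample = [
        ("ja_JP.ujis", "EUC_JP"),
        ("ja_JP.eucjp", "EUC_JP"),
        ("ja_JP.eucjp", "EUC_JP"),  # duplicate line from "locale -a"
    ]
    for row in dedup_collations(sample):
        print(row)
```

With input like the sample, the short name "ja_JP" for EUC_JP is produced exactly once no matter how many times or in what order the underlying locales are listed, which is the property the failing case above lacks.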
regards, tom lane