Hello,

While reviewing the ICU versioning work a while back, I mentioned the idea of using a user-supplied command to get a collversion string for the libc collation provider. I was reminded about that by recent news about an upcoming glibc/CLDR resync that is likely to affect PostgreSQL users (though, I guess, probably only when they do a major OS upgrade).

Here's an experimental patch to try that idea out. For example, you might set it like this:

  libc_collation_version_command = 'md5 /usr/share/locale/@LC_COLLATE@/LC_COLLATE | sed "s/.* = //"'

... or, on a Debian system using the locales package, like this:

  libc_collation_version_command = 'dpkg -s locales | grep Version: | sed "s/Version: //"'

Using the checksum approach, it works like this:

  postgres=# alter collation "xx_XX" refresh version;
  NOTICE: changing version from b88d621596b7e61337e832f7841066a9 to 7b008442fbaf5dfe7a10fb3d82a634ab
  ALTER COLLATION
  postgres=# select * from pg_collation where collname = 'xx_XX';
  -[ RECORD 1 ]-+---------------------------------
  collname      | xx_XX
  collnamespace | 2200
  collowner     | 10
  collprovider  | c
  collencoding  | 6
  collcollate   | en_US.UTF-8
  collctype     | UTF-8
  collversion   | 7b008442fbaf5dfe7a10fb3d82a634ab

When the collation definition changes you get the desired scary warning on next attempt to use it in a fresh backend:

  postgres=# select * from t order by v;
  WARNING: collation "xx_XX" has version mismatch
  DETAIL: The collation in the database was created using version b88d621596b7e61337e832f7841066a9, but the operating system provides version 7b008442fbaf5dfe7a10fb3d82a634ab.
  HINT: Rebuild all objects affected by this collation and run ALTER COLLATION public."xx_XX" REFRESH VERSION, or build PostgreSQL with the right library version.

The problem is that it isn't in effect at initdb time so if you add that later it only affects new locales. You'd need a way to do that during init to capture the imported system locale versions, and that's a really ugly string to have to pass into some initdb option. Ugh.

Another approach would be to decide that we're willing to put non-portable version extracting magic in pg_locale.c.
On a long flight I hacked my libc to store a version string (based on CLDR version or whatever) in its binary locale definitions and provide a proper interface to ask for it, modelled on querylocale(3):

  const char *querylocaleversion(int mask, locale_t locale);

Then the patch for pg_locale.c is trivial, see attached. While I could conceivably try to convince my local friendly OS to take such a patch, the real question is how to deal with glibc. Does anyone know of a way to extract a version string from glibc using existing interfaces? I heard there was an undocumented way but I haven't been able to find it -- probably because I was, erm, looking in the documentation.

Or maybe this isn't worth bothering with, and we should just build out the ICU support and then make it the default and be done with it.

In passing, here's a patch to add tab completion for ALTER COLLATION ... REFRESH VERSION.

--
Thomas Munro
http://www.enterprisedb.com

0001-Auto-complete-ALTER-COLLATION-.-REFRESH-VERSION.patch
Description: Binary data
0001-Add-libc_collation_version_command-GUC.patch
Description: Binary data