On 02/15/2013 12:45 AM, Peter Eisentraut wrote:
> On 2/11/13 10:22 PM, Greg Stark wrote:
>> On Sun, Feb 10, 2013 at 11:47 PM, Tom Lane <t...@sss.pgh.pa.us> wrote:
>>> If we knew that postgresql.conf was stored in, say, UTF8, then it would
>>> probably be possible to perform encoding conversion to get string
>>> variables into the database encoding. Perhaps we should allow some
>>> magic syntax to tell us the encoding of a config file?
>>>
>>> file_encoding = 'utf8' # must precede any non-ASCII in the file
>> If we're going to do that we might as well use the Emacs standard
>> -*-coding: latin-1;-*-
> Yes, or more generally perhaps what Python does:
> http://docs.python.org/2.7/reference/lexical_analysis.html#encoding-declarations
>
> (In Python 2, the default is ASCII, in Python 3, the default is UTF8.)

Note that Python also respects a BOM in a UTF-8 file, treating the BOM as
flagging the file as being UTF-8.

"In addition, if the first bytes of the file are the UTF-8 byte-order mark
('\xef\xbb\xbf'), the declared file encoding is UTF-8."

IMO we should do the same. If there's no explicit encoding declaration,
treat it as UTF-8 if there's a BOM and as the platform's local character
encoding otherwise.

-- 
 Craig Ringer                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services
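
For illustration only, the fallback rule proposed above (explicit declaration first, then BOM, then the platform's encoding) could be sketched in Python roughly as follows. This is not PostgreSQL code: the function names pick_encoding and read_config_text are hypothetical, and the `declared` argument stands in for whatever in-file declaration syntax might eventually be chosen, which this thread has not settled.

```python
import codecs
import locale


def pick_encoding(raw, declared=None):
    """Choose a codec name for the raw bytes of a config file.

    `declared` is a hypothetical stand-in for an explicit in-file
    encoding declaration; how it would be parsed is not shown here.
    """
    if declared is not None:
        # An explicit declaration always takes precedence.
        return declared
    if raw.startswith(codecs.BOM_UTF8):
        # A leading UTF-8 BOM flags the file as UTF-8. The "utf-8-sig"
        # codec decodes UTF-8 and drops the BOM from the result.
        return "utf-8-sig"
    # No declaration and no BOM: fall back to the platform's locale encoding.
    return locale.getpreferredencoding(False)


def read_config_text(raw, declared=None):
    """Decode the file contents using the encoding chosen above."""
    return raw.decode(pick_encoding(raw, declared))
```

A file saved by an editor that writes a BOM would then decode as UTF-8 with the BOM removed, while a BOM-less file with no declaration would be read in whatever encoding the server's locale implies. Converting the result into the database encoding is a separate step that this sketch leaves out.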