On Mon, Sep 26, 2011 at 11:09 AM, Tatsuo Ishii <is...@postgresql.org> wrote:
>> "David E. Wheeler" <da...@kineticode.com>
>> <cajw2+qdyg1+xlahdqnjs3ackmcsvcdkv_lcapwutwmxl9dz...@mail.gmail.com> writes:
>>> On Sep 25, 2011, at 9:58 PM, Itagaki Takahiro wrote:
>>>> I'm thinking about only COPY FROM for reads, but if someone wants to add
>>>> BOM in COPY TO, we might also support COPY TO WITH BOM for writes.
>>
>>> I think it would have to be optional, since "some recipients of UTF-8
>>> encoded data do not expect a BOM."
>>
>> Putting a BOM into UTF8 data is flat out invalid per spec --- the fact
>> that Microsloth does it does not make it standards-conformant.
>>
>> I think that accepting it on input can be sensible, on the principle of
>> "be liberal in what you accept", but the other side of that is "be
>> conservative in what you send". No BOMs in output, please.
>
> Suppose a user uses brain-dead editor, which does not accept UTF-8
> without BOM. He decides to save his editor data into PostgreSQL using
> COPY FROM. He extracts the data using COPY TO. Now he finds that his
> stupid editor does not accept his data any more.
>
> So I think if we decide to accept UTF-8 with BOM, we should keep BOM
> when importing the data and output the data with BOM. If we don't want
> to output UTF-8 with BOM, we should not accept UTF-8 with BOM. It
> seems we don't have much choice...

Maybe this needs to be an optional behavior, controlled by some COPY option.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
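
To make the "optional behavior" idea concrete, here is a rough sketch in Python of what such a switch might do at the byte level: tolerate one leading UTF-8 BOM on input, and emit one on output only when explicitly asked. The names strip_bom, emit and with_bom are made up for illustration; this is not PostgreSQL's COPY syntax or implementation, just one way the trade-off discussed in this thread could be expressed.

```python
import codecs
from typing import Tuple

# The UTF-8 encoding of U+FEFF: b"\xef\xbb\xbf"
UTF8_BOM = codecs.BOM_UTF8


def strip_bom(data: bytes) -> Tuple[bytes, bool]:
    """Hypothetical input side: drop a single leading UTF-8 BOM, if any.

    Returns (payload, had_bom) so a caller could remember whether the
    source file carried a BOM.
    """
    if data.startswith(UTF8_BOM):
        return data[len(UTF8_BOM):], True
    return data, False


def emit(payload: bytes, with_bom: bool = False) -> bytes:
    """Hypothetical output side: prepend a BOM only when requested.

    The default (no BOM) follows "be conservative in what you send".
    """
    return UTF8_BOM + payload if with_bom else payload
```

With something like this, the editor scenario would round-trip only if the user opts in on output, e.g. emit(payload, with_bom=True); the default output stays BOM-free.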