Tom Lane wrote:
> Putting a BOM into UTF8 data is flat out invalid per spec --- the fact
> that Microsloth does it does not make it standards-conformant.

Could you share a pointer to the spec?
All I've ever heard is that a BOM is optional for UTF-8 but not forbidden.
The Unicode FAQ (http://unicode.org/faq/utf_bom.html#BOM) states that
"some recipients of UTF-8 encoded data do not expect a BOM".
Postgres obviously belongs to those recipients.
That's why all my psql scripts that transfer data from MSSQL to Postgres
need a '\! perl -CD -pi.orig -e "tr/\x{feff}//d" "C:/datafile.txt"'
before feeding the data into COPY FROM.
Accepting a BOM tolerantly on input and writing one only on user request
is probably what would help most users.
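
To illustrate what I mean by tolerant reading and opt-in writing, here is a
minimal sketch in Python rather than Perl. The function names
read_utf8_tolerant and write_utf8 and the with_bom parameter are hypothetical
and only serve the example; this is not how Postgres handles input. Unlike the
Perl one-liner above, which deletes every U+FEFF in the file, the sketch drops
only a single leading BOM.

```python
import codecs


def read_utf8_tolerant(data: bytes) -> str:
    """Decode UTF-8 bytes, dropping one leading BOM if it is present."""
    # The standard "utf-8-sig" codec skips a leading EF BB BF sequence and
    # otherwise behaves like plain "utf-8".
    return data.decode("utf-8-sig")


def write_utf8(text: str, with_bom: bool = False) -> bytes:
    """Encode text as UTF-8, prepending a BOM only when the caller asks."""
    payload = text.encode("utf-8")
    return codecs.BOM_UTF8 + payload if with_bom else payload


if __name__ == "__main__":
    # Sample row as a tool that emits a BOM might write it (made-up data).
    exported = codecs.BOM_UTF8 + "1\tfoo\n".encode("utf-8")
    print(repr(read_utf8_tolerant(exported)))
    print(write_utf8("1\tfoo\n", with_bom=True))
```

Input without a BOM passes through unchanged, so a reader like this can be
applied to every file regardless of which tool produced it.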
Regards,
Brar