On 12/4/05, Tom Lane <[EMAIL PROTECTED]> wrote:
> Paul Lindner <[EMAIL PROTECTED]> writes:
> > On Sun, Dec 04, 2005 at 11:34:16AM -0500, Tom Lane wrote:
> >> Paul Lindner <[EMAIL PROTECTED]> writes:
> >>> iconv -c -f UTF8 -t UTF8 -o fixed.sql dump.sql
> >>
> >> Is that really a one-size-fits-all solution?  Especially with -c?
>
> > I'd say yes, and the -c flag is needed so iconv strips out the
> > invalid characters.
>
> That's exactly what's bothering me about it.  If we recommend that
> we had better put a large THIS WILL DESTROY YOUR DATA warning first.
> The problem is that the data is not "invalid" from the user's point
> of view --- more likely, it's in some non-UTF8 encoding --- and so
> just throwing away some of the characters is unlikely to make people
> happy.

Nor is it even guaranteed to make the data load: If the column is
unique constrained and the removal of the non-UTF8 characters makes two
rows have the same data where they didn't before...
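
To illustrate the collision, here is a minimal Python sketch. The key
values are hypothetical, and decoding with errors="ignore" is only a rough
stand-in for what iconv -c does, not a reproduction of iconv itself:

```python
def strip_invalid_utf8(raw: bytes) -> bytes:
    """Drop bytes that do not form valid UTF-8 (rough stand-in for iconv -c)."""
    return raw.decode("utf-8", errors="ignore").encode("utf-8")


# Two distinct keys as they might appear in a Latin-1 dump (hypothetical data).
key_a = b"caf\xe9"  # "cafe" with e-acute in Latin-1
key_b = b"caf\xe8"  # "cafe" with e-grave in Latin-1

cleaned_a = strip_invalid_utf8(key_a)
cleaned_b = strip_invalid_utf8(key_b)

print(key_a != key_b)          # the rows differ before cleaning
print(cleaned_a == cleaned_b)  # both collapse to b"caf" after cleaning
```

Both lines print True: the two rows were distinct in the dump, but a unique
index would reject the second one after the "fix".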

The way to preserve the data is to switch the column to be a bytea.
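
A small Python sketch of why keeping the raw bytes helps, assuming (purely
for illustration) that the value was originally Latin-1. Nothing here is
PostgreSQL-specific; the bytes object just plays the role of a bytea value:

```python
raw = b"caf\xe9"  # hypothetical column value, not valid UTF-8

# Treated as text in a UTF8 database, this value would be rejected.
try:
    raw.decode("utf-8")
    is_valid_utf8 = True
except UnicodeDecodeError:
    is_valid_utf8 = False

# Kept as opaque bytes (as a bytea column would), nothing is lost.
stored = bytes(raw)

# Later, once the real source encoding is known (assumed Latin-1 here),
# the value can be converted properly instead of having bytes dropped.
as_utf8 = stored.decode("latin-1").encode("utf-8")

print(is_valid_utf8)  # False
print(stored == raw)  # True
print(as_utf8)        # b'caf\xc3\xa9'
```

The point is only that the conversion decision can be deferred until the
actual encoding is known, rather than being made destructively at load time.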
