Re: [Slony1-general] strategy to fix utf8 encoding errors

Jeff Frost Sat, 27 May 2006 21:10:46 -0700

On Sat, 27 May 2006, Vivek Khera wrote:

> Aside from playing whack-a-mole and fixing the errors one at a time as they 
> are reported by slon, what can I do to make the data UTF8 safe for the strict 
> checking of Pg 8.1?
>
> And what does one do to figure out what character to replace or do you 
> generally just cut the offending character from the row?


Generally you use iconv, but that requires a dump/reload.  I'm not sure whether
you could hook that into Slony somehow.

Here's a snippet from the HISTORY file:

      * Some users are having problems loading UTF-8 data into 8.1.X. This
        is because previous versions allowed invalid UTF-8 byte sequences
        to be entered into the database, and this release properly accepts
        only valid UTF-8 sequences. One way to correct a dumpfile is to run
        the command "iconv -c -f UTF-8 -t UTF-8 -o cleanfile.sql
        dumpfile.sql". The -c option removes invalid character sequences. A
        diff of the two files will show the sequences that are invalid.
        "iconv" reads the entire input file into memory so it might be
        necessary to use split to break up the dump into multiple smaller
        files for processing.
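
If you would rather see exactly which bytes are bad before deciding whether to
replace or cut them, something along these lines might help. It is only a rough
sketch, not anything from the HISTORY file: the function names
find_invalid_utf8 and strip_invalid_utf8 are made up for illustration, and it
relies on Python's strict UTF-8 decoder, which may not agree with the 8.1
checks in every corner case. It reads the dump one line at a time, so the whole
file never has to fit in memory.

```python
def find_invalid_utf8(path):
    """Yield (line_number, byte_offset, bad_bytes) for each invalid sequence.

    line_number is 1-based; byte_offset is the 0-based offset within the line.
    """
    with open(path, "rb") as dump:
        for lineno, raw in enumerate(dump, start=1):
            pos = 0
            while pos < len(raw):
                try:
                    raw[pos:].decode("utf-8")
                    break  # the rest of this line is valid
                except UnicodeDecodeError as err:
                    # err.start / err.end are relative to the slice raw[pos:]
                    start, end = pos + err.start, pos + err.end
                    yield lineno, start, raw[start:end]
                    pos = end  # resume scanning after the bad bytes


def strip_invalid_utf8(src, dst):
    """Copy src to dst, dropping invalid bytes (roughly what iconv -c does)."""
    with open(src, "rb") as fin, open(dst, "wb") as fout:
        for raw in fin:
            fout.write(raw.decode("utf-8", errors="ignore").encode("utf-8"))
```

Looping over find_invalid_utf8("dumpfile.sql") and printing each tuple gives a
list of offending bytes to inspect by hand. strip_invalid_utf8 simply drops
them, so diff the result against the original before loading it.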


-- 
Jeff Frost, Owner       <[EMAIL PROTECTED]>
Frost Consulting, LLC   http://www.frostconsultingllc.com/
Phone: 650-780-7908     FAX: 650-649-1954
_______________________________________________
Slony1-general mailing list
[email protected]
http://gborg.postgresql.org/mailman/listinfo/slony1-general
