--- PFC <[EMAIL PROTECTED]> wrote:
>
> > $ iconv -f US-ASCII -t UTF-8 < test.sql > out.sql
> > iconv: illegal input sequence at position 114500
> >
> > Any ideas how the job can be accomplished reliably?
> >
> > Also my database may contain data in multiple encodings
> > like WINDOWS-1251 and WINDOWS-1256 in various places
> > as data has been inserted by different people using
> > different sources and client software.
>
> You could use a simple program like that (in Python):
>
> output = open( "unidump", "w" )
> for line in open( "your dump" ):
>     for encoding in "utf-8", "iso-8859-15", "whatever":
>         try:
>             output.write( unicode( line, encoding ).encode( "utf-8" ))
>             break
>         except UnicodeError:
>             pass
>     else:
>         print "No suitable encoding for line..."

This may not work, because conversion to UTF-8 can succeed (no runtime
error) even for an incorrect guess of the original encoding, but the
result will then be well-formed UTF-8 holding the wrong characters.

Regds
Rajesh Kumar Mallah

>
> I'd say this might work, if UTF-8 cannot absorb an apostrophe inside a
> multibyte character. Can it?
>
> Or you could do that to all your tables using SELECTs but it's going to
> be painful...
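
A minimal sketch of both points raised above, written for Python 3 (the quoted program is Python 2). The sample string and the helper names first_accepted and ascii_byte_inside_multibyte are assumptions made up for illustration; they are not part of anyone's dump or tooling. It tries encodings in order, as the quoted loop does, and shows that a single-byte guess such as iso-8859-15 accepts WINDOWS-1251 bytes without raising. It also checks that, in UTF-8, every byte of a multibyte sequence is 0x80 or above, so an ASCII apostrophe (0x27) cannot occur inside one.

```python
# Illustrative sketch only: the sample text and helper names are assumptions.

def first_accepted(raw, encodings):
    """Return (encoding, text) for the first encoding that decodes raw without error."""
    for enc in encodings:
        try:
            return enc, raw.decode(enc)
        except UnicodeError:
            continue
    return None, None


def ascii_byte_inside_multibyte(text):
    """Return True if any multibyte UTF-8 sequence in text contains a byte below 0x80."""
    for ch in text:
        encoded = ch.encode("utf-8")
        if len(encoded) > 1 and any(b < 0x80 for b in encoded):
            return True
    return False


# Cyrillic sample word, written with escapes so the file itself stays ASCII.
original = "\u041f\u0440\u0438\u0432\u0435\u0442"

# Bytes as a WINDOWS-1251 client would have stored them.
raw = original.encode("cp1251")

# utf-8 rejects these bytes, but iso-8859-15 accepts any byte sequence,
# so the loop stops there and never reaches the correct cp1251.
accepted, guessed = first_accepted(raw, ["utf-8", "iso-8859-15", "cp1251"])
print(accepted, guessed == original)

# Multibyte UTF-8 sequences consist only of bytes >= 0x80.
print(ascii_byte_inside_multibyte(original + "\u00e9\u20ac"))
```

So a wrong guess goes unnoticed unless the candidate encodings are ordered carefully or the decoded text is checked some other way, for example against the alphabet expected in that column. The apostrophe concern does not arise for UTF-8 itself.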