Re: [GENERAL] MSSQL to PostgreSQL : Encoding problem

2006-11-24 Thread Arnaud Lesauvage
Tomi NA wrote: 2006/11/23, Arnaud Lesauvage [EMAIL PROTECTED]: Arnaud Lesauvage wrote: Brandon Aiken wrote: It also might be a big/little endian problem, although I always thought that was platform specific, not locale specific. Try the UCS-2-INTERNAL and UCS-4-INTERNAL codepages

Re: [GENERAL] MSSQL to PostgreSQL : Encoding problem

2006-11-23 Thread Arnaud Lesauvage
Brandon Aiken wrote: It also might be a big/little endian problem, although I always thought that was platform specific, not locale specific. Try the UCS-2-INTERNAL and UCS-4-INTERNAL codepages in iconv, which should use the two-byte or four-byte versions of UCS encoding using the system's
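To illustrate the byte-order question raised here: the same character has two possible two-byte layouts in UTF-16/UCS-2, and reading one layout as the other gives a different code point rather than an error. A minimal Python sketch follows. The helper name `utf16_layouts` is hypothetical, and Python's codecs stand in for iconv, which is an assumption and not what the posters ran.

```python
import sys

def utf16_layouts(text):
    """Return the little-endian, big-endian and native-order UTF-16 bytes of text."""
    # "Native" here means the byte order of the machine running this script.
    native = "utf-16-le" if sys.byteorder == "little" else "utf-16-be"
    return {
        "le": text.encode("utf-16-le"),
        "be": text.encode("utf-16-be"),
        "native": text.encode(native),
    }

layouts = utf16_layouts("é")  # U+00E9
# Decoding with the wrong byte order does not fail; it silently yields another character.
swapped = layouts["le"].decode("utf-16-be")
```

Naming the byte order explicitly (`utf-16-le` or `utf-16-be`) keeps the result independent of the machine doing the conversion, whereas a native-order conversion depends on where it runs.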

Re: [GENERAL] MSSQL to PostgreSQL : Encoding problem

2006-11-23 Thread Arnaud Lesauvage
Arnaud Lesauvage wrote: Brandon Aiken wrote: It also might be a big/little endian problem, although I always thought that was platform specific, not locale specific. Try the UCS-2-INTERNAL and UCS-4-INTERNAL codepages in iconv, which should use the two-byte or four-byte versions of UCS

Re: [GENERAL] MSSQL to PostgreSQL : Encoding problem

2006-11-23 Thread Tomi NA
2006/11/23, Arnaud Lesauvage [EMAIL PROTECTED]: Arnaud Lesauvage wrote: Brandon Aiken wrote: It also might be a big/little endian problem, although I always thought that was platform specific, not locale specific. Try the UCS-2-INTERNAL and UCS-4-INTERNAL codepages in iconv, which

Re: [GENERAL] MSSQL to PostgreSQL : Encoding problem

2006-11-22 Thread Arnaud Lesauvage
Richard Huxton wrote: Arnaud Lesauvage wrote: Hi list! I already posted this as "COPY FROM encoding error", but I have been doing some more tests since then. I'm trying to export data from MS SQL Server to PostgreSQL. The tables are quite big (20M rows), so a CSV export and a COPY FROM

Re: [GENERAL] MSSQL to PostgreSQL : Encoding problem

2006-11-22 Thread Arnaud Lesauvage
Tomi NA wrote: I think I'll go this way... No other choice, actually! The MSSQL database is in SQL_Latin1_General_CP1_CI_AS. I don't really understand what this is. It supports the euro symbol, so it is probably not pure LATIN1, right? I suppose you'd have to look at the latin1 codepage
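On the euro symbol question: ISO-8859-1 (LATIN1) has no euro sign, while Windows-1252 places it at byte 0x80 and ISO-8859-15 (LATIN9) at 0xA4. A small Python check of those byte values, as a sketch only. The function name `euro_bytes` is hypothetical, and Python's codec tables are assumed to match what SQL Server and PostgreSQL use for these encodings.

```python
def euro_bytes():
    """Map each encoding name to the bytes of the euro sign, or None if unencodable."""
    result = {}
    for enc in ("cp1252", "iso8859-15", "latin-1", "utf-8"):
        try:
            result[enc] = "€".encode(enc)
        except UnicodeEncodeError:
            # The encoding has no code point for the euro sign.
            result[enc] = None
    return result

table = euro_bytes()
```

If the export really is Windows-1252, a euro sign would appear in the file as the single byte 0x80, which is not a valid stand-alone byte in UTF-8.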

Re: [GENERAL] MSSQL to PostgreSQL : Encoding problem

2006-11-22 Thread Tomi NA
2006/11/21, Arnaud Lesauvage [EMAIL PROTECTED]: Hi list! I already posted this as "COPY FROM encoding error", but I have been doing some more tests since then. I'm trying to export data from MS SQL Server to PostgreSQL. The tables are quite big (20M rows), so a CSV export and a COPY FROM

Re: [GENERAL] MSSQL to PostgreSQL : Encoding problem

2006-11-22 Thread Arnaud Lesauvage
Richard Huxton wrote: Arnaud Lesauvage wrote: Richard Huxton wrote: Or go via MS-Access/Perl and ODBC/DBI perhaps? Yes, I think it would work. The problem is that the DB is too big for this kind of export. Using DTS from MSSQL to export directly to PostgreSQL using psqlODBC Unicode

Re: [GENERAL] MSSQL to PostgreSQL : Encoding problem

2006-11-22 Thread Alvaro Herrera
Arnaud Lesauvage wrote: Tomi NA wrote: I think I'll go this way... No other choice, actually! The MSSQL database is in SQL_Latin1_General_CP1_CI_AS. I don't really understand what this is. It supports the euro symbol, so it is probably not pure LATIN1, right? I suppose you'd have to

Re: [GENERAL] MSSQL to PostgreSQL : Encoding problem

2006-11-22 Thread Tomi NA
2006/11/22, Arnaud Lesauvage [EMAIL PROTECTED]: Tomi NA wrote: 2006/11/21, Arnaud Lesauvage [EMAIL PROTECTED]: Hi list! I already posted this as "COPY FROM encoding error", but I have been doing some more tests since then. I'm trying to export data from MS SQL Server to PostgreSQL. The

Re: [GENERAL] MSSQL to PostgreSQL : Encoding problem

2006-11-22 Thread Magnus Hagander
I have done this in Delphi using its built-in UTF8 encoding and decoding routines. You can get a free copy of Delphi Turbo Explorer which includes components for MS SQL server and ODBC, so it would be pretty straightforward to get this working. The actual method in Delphi

Re: [GENERAL] MSSQL to PostgreSQL : Encoding problem

2006-11-22 Thread Magnus Hagander
I already posted this as "COPY FROM encoding error", but I have been doing some more tests since then. I'm trying to export data from MS SQL Server to PostgreSQL. The tables are quite big (20M rows), so a CSV export and a COPY FROM import seems to be the only reasonable solution.

Re: [GENERAL] MSSQL to PostgreSQL : Encoding problem

2006-11-22 Thread Arnaud Lesauvage
Alvaro Herrera wrote: Arnaud Lesauvage wrote: Tomi NA wrote: I think I'll go this way... No other choice, actually! The MSSQL database is in SQL_Latin1_General_CP1_CI_AS. I don't really understand what this is. It supports the euro symbol, so it is probably not pure LATIN1, right? I

Re: [GENERAL] MSSQL to PostgreSQL : Encoding problem

2006-11-22 Thread Arnaud Lesauvage
Tomi NA wrote: 2006/11/21, Arnaud Lesauvage [EMAIL PROTECTED]: Hi list! I already posted this as "COPY FROM encoding error", but I have been doing some more tests since then. I'm trying to export data from MS SQL Server to PostgreSQL. The tables are quite big (20M rows), so a CSV export and

Re: [GENERAL] MSSQL to PostgreSQL : Encoding problem

2006-11-22 Thread Alvaro Herrera
Arnaud Lesauvage wrote: Alvaro Herrera wrote: Arnaud Lesauvage wrote: Tomi NA wrote: I think I'll go this way... No other choice, actually! The MSSQL database is in SQL_Latin1_General_CP1_CI_AS. I don't really understand what this is. It supports the euro symbol, so it is probably

Re: [GENERAL] MSSQL to PostgreSQL : Encoding problem

2006-11-22 Thread Richard Huxton
Arnaud Lesauvage wrote: Richard Huxton wrote: Or go via MS-Access/Perl and ODBC/DBI perhaps? Yes, I think it would work. The problem is that the DB is too big for this kind of export. Using DTS from MSSQL to export directly to PostgreSQL using psqlODBC Unicode Driver, I exported ~1000

Re: [GENERAL] MSSQL to PostgreSQL : Encoding problem

2006-11-22 Thread Tony Caduto
Of course, but it doesn't work! Whatever client encoding I choose in PostgreSQL before COPYing, I get the 'invalid byte sequence' error. The furthest I can get is exporting to UNICODE and importing as UTF8. Then COPY only breaks on the euro symbol (otherwise it breaks very early, I

Re: [GENERAL] MSSQL to PostgreSQL : Encoding problem

2006-11-22 Thread Arnaud Lesauvage
Magnus Hagander wrote: I have done this in Delphi using its built-in UTF8 encoding and decoding routines. You can get a free copy of Delphi Turbo Explorer which includes components for MS SQL server and ODBC, so it would be pretty straightforward to get this working. The actual

Re: [GENERAL] MSSQL to PostgreSQL : Encoding problem

2006-11-22 Thread Arnaud Lesauvage
Alvaro Herrera wrote: Arnaud Lesauvage wrote: Alvaro Herrera wrote: Arnaud Lesauvage wrote: Tomi NA wrote: I think I'll go this way... No other choice, actually! The MSSQL database is in SQL_Latin1_General_CP1_CI_AS. I don't really understand what this is. It supports the euro symbol,

Re: [GENERAL] MSSQL to PostgreSQL : Encoding problem

2006-11-22 Thread Thomas H.
Or go via MS-Access/Perl and ODBC/DBI perhaps? Yes, I think it would work. The problem is that the DB is too big for this kind of export. Using DTS from MSSQL to export directly to PostgreSQL using psqlODBC Unicode Driver, I exported ~1000 rows per second in a 2-columns table with ~20M

Re: [GENERAL] MSSQL to PostgreSQL : Encoding problem

2006-11-22 Thread Alvaro Herrera
Arnaud Lesauvage wrote: mydb=# SET client_encoding TO LATIN9; SET mydb=# COPY statistiques.detailrecherche (log_gid, champrecherche, valeurrecherche) FROM 'E:\\Production\\Temp\\detailrecherche_ansi.csv' CSV; ERROR: invalid byte sequence for encoding LATIN9: 0x00 HINT: This error can
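A 0x00 byte in a file that is supposed to be single-byte text is one hint that the file may actually be UTF-16, where ASCII-range characters carry a NUL as their second byte. The following Python sketch inspects a sample of the file for a byte-order mark or that NUL pattern. The function name `guess_utf16` and the thresholds are assumptions for illustration; this is a heuristic, not a reliable detector.

```python
import codecs

def guess_utf16(sample: bytes):
    """Return 'utf-16-le', 'utf-16-be', or None for a sample of raw file bytes."""
    # A byte-order mark at the start is the strongest signal.
    if sample.startswith(codecs.BOM_UTF16_LE):
        return "utf-16-le"
    if sample.startswith(codecs.BOM_UTF16_BE):
        return "utf-16-be"
    pairs = len(sample) // 2
    if pairs == 0:
        return None
    # Count NUL bytes at even and odd offsets over whole two-byte units.
    even_nuls = sample[0:pairs * 2:2].count(0)
    odd_nuls = sample[1:pairs * 2:2].count(0)
    # Mostly-ASCII UTF-16 text has NULs in one position of each pair only.
    if odd_nuls > pairs // 2 and even_nuls == 0:
        return "utf-16-le"
    if even_nuls > pairs // 2 and odd_nuls == 0:
        return "utf-16-be"
    return None
```

In practice one would read the first few kilobytes of the CSV in binary mode and pass them to this function before choosing a client_encoding or a conversion step.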

Re: [GENERAL] MSSQL to PostgreSQL : Encoding problem

2006-11-22 Thread Alvaro Herrera
Arnaud Lesauvage wrote: Alvaro Herrera wrote: Arnaud Lesauvage wrote: mydb=# SET client_encoding TO LATIN9; SET mydb=# COPY statistiques.detailrecherche (log_gid, champrecherche, valeurrecherche) FROM 'E:\\Production\\Temp\\detailrecherche_ansi.csv' CSV; ERROR: invalid byte

Re: [GENERAL] MSSQL to PostgreSQL : Encoding problem

2006-11-22 Thread Arnaud Lesauvage
Alvaro Herrera wrote: Arnaud Lesauvage wrote: Alvaro Herrera wrote: Arnaud Lesauvage wrote: mydb=# SET client_encoding TO LATIN9; SET mydb=# COPY statistiques.detailrecherche (log_gid, champrecherche, valeurrecherche) FROM 'E:\\Production\\Temp\\detailrecherche_ansi.csv' CSV; ERROR:

Re: [GENERAL] MSSQL to PostgreSQL : Encoding problem

2006-11-22 Thread Bruce Momjian
Arnaud Lesauvage wrote: I thought Win1252 was supposed to be almost the same as Latin1. While I'd expect certain differences, I wouldn't expect it to use 0x00 as data! Maybe you could have DTS export Unicode, which would presumably be UTF-16, then recode that to something else

Re: [GENERAL] MSSQL to PostgreSQL : Encoding problem

2006-11-22 Thread Magnus Hagander
I thought Win1252 was supposed to be almost the same as Latin1. While I'd expect certain differences, I wouldn't expect it to use 0x00 as data! Maybe you could have DTS export Unicode, which would presumably be UTF-16, then recode that to something else (possibly UTF-8)
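The recoding step suggested here (UTF-16 export, then conversion to UTF-8 before COPY) could be sketched in Python as below. This is an illustration under assumptions, not what anyone in the thread ran: the function name `recode_utf16_to_utf8` and the chunk size are hypothetical, and the input is assumed to start with a byte-order mark, which Python's `utf-16` codec requires when reading a file incrementally.

```python
def recode_utf16_to_utf8(src_path, dst_path, chunk_size=1 << 20):
    """Stream-convert a UTF-16 text file (with BOM) to UTF-8 in fixed-size chunks."""
    # newline="" disables newline translation so CSV line endings pass through unchanged.
    with open(src_path, "r", encoding="utf-16", newline="") as src, \
         open(dst_path, "w", encoding="utf-8", newline="") as dst:
        while True:
            chunk = src.read(chunk_size)  # chunk_size counts characters, not bytes
            if not chunk:
                break
            dst.write(chunk)
```

Reading in chunks keeps memory use flat for a table of around 20M rows. The resulting file would then be loaded with client_encoding set to UTF8.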

Re: [GENERAL] MSSQL to PostgreSQL : Encoding problem

2006-11-22 Thread Brandon Aiken
PROTECTED] On Behalf Of Arnaud Lesauvage Sent: Wednesday, November 22, 2006 12:38 PM To: Arnaud Lesauvage; General Subject: Re: [GENERAL] MSSQL to PostgreSQL : Encoding problem Alvaro Herrera wrote: Arnaud Lesauvage wrote: Alvaro Herrera wrote: Arnaud Lesauvage wrote: mydb=# SET

Re: [GENERAL] MSSQL to PostgreSQL : Encoding problem

2006-11-22 Thread Tomi NA
2006/11/22, Brandon Aiken [EMAIL PROTECTED]: Gee, didn't Unicode just so simplify this codepage mess? Remember when it was just ASCII, EBCDIC, ANSI, and localized codepages? Unicode is heaven-sent, compared to 3 or 4 codepages representing any given (obviously non-English) language, and 3

Re: [GENERAL] MSSQL to PostgreSQL : Encoding problem

2006-11-22 Thread Martijn van Oosterhout
On Wed, Nov 22, 2006 at 01:55:55PM -0500, Brandon Aiken wrote: Gee, didn't Unicode just so simplify this codepage mess? Remember when it was just ASCII, EBCDIC, ANSI, and localized codepages? I think that's one reason why Unix has standardised on UTF-8 rather than one of the other Unicode

Re: [GENERAL] MSSQL to PostgreSQL : Encoding problem

2006-11-21 Thread Tony Caduto
Arnaud Lesauvage wrote: I then try to import into PostgreSQL. The furthest I can get is when using the UNICODE export, and importing it using a client_encoding set to UTF8 (I tried WIN1252, LATIN9, LATIN1, ...). The copy then stops with an error: ERROR: invalid byte sequence for encoding

Re: [GENERAL] MSSQL to PostgreSQL : Encoding problem

2006-11-21 Thread Richard Huxton
Tony Caduto wrote: Arnaud Lesauvage wrote: I then try to import into PostgreSQL. The furthest I can get is when using the UNICODE export, and importing it using a client_encoding set to UTF8 (I tried WIN1252, LATIN9, LATIN1, ...). The copy then stops with an error: ERROR: invalid byte