Re: [SQL] Changing character set when the damage is done

2006-12-24 Thread Alexis Paul Bertolini

Ragnar wrote:

On lau, 2006-12-23 at 00:12 +0100, Alexis Paul Bertolini wrote:
  
depends on whether all imports have been in the same encoding 
or not. SQL_ASCII basically accepts and stores the characters

without interpretation, so if all imorts were done with one
client_encoding, you should be able to read the values again
using the same encoding.

are all the instances of these 5 characters the same everywhere?
in that case there is a good chance that you can fix the values
with UPDATES

when you say 
'all accented vowels show up differently from what I expected'

do you mean in psql or some other tool?
do these show as 1 or 2 characters?

Thanks for your reply, Gnari,

all imports were done using default values, so the encoding ought to be 
the same. As proof of this, same characters show up in the same way.


They show up in PHP, PgAdminIII and psql. All the same. A lowercase e 
with a grave accent appears as a capital A with the cedilla, followed by 
an umlaut (just the umlaut, on its own). So to answer your question, 
they are two characters.


Oh... and Merry Christmas!

Alex.

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

  http://www.postgresql.org/docs/faq


Re: [SQL] Changing character set when the damage is done

2006-12-24 Thread Michael Fuhr
On Sun, Dec 24, 2006 at 05:05:37PM +0100, Alexis Paul Bertolini wrote:
> They show up in PHP, PgAdminIII and psql. All the same. A lowercase e 
> with a grave accent appears as a capital A with the cedilla, followed by 
> an umlaut (just the umlaut, on its own). So to answer your question, 
> they are two characters.

Are you sure that's not a tilde (a wavy line above the A) instead
of a cedilla (a hook below the A)?  The UTF-8 encoding for lowercase e
with grave is 0xc3 0xa8, which in ISO-8859-1 (LATIN1) or Windows-1252
is uppercase A with tilde followed by a diaeresis (an umlaut on its
own).  Does the data appear correctly if you do either of the following?

SELECT convert(colname, 'utf8', 'latin1') FROM tablename;
SELECT convert(colname, 'utf8', 'win1252') FROM tablename;

If you use characters like "smart quotes" or the Euro sign then
you'll probably need to use win1252 instead of latin1.  Does the
following show a Euro sign or does it show blank?

SELECT convert('\342\202\254', 'utf8', 'win1252');

-- 
Michael Fuhr

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [SQL] Changing character set when the damage is done

2006-12-24 Thread Alexis Paul Bertolini

Michael Fuhr wrote:

Are you sure that's not a tilde (a wavy line above the A) instead
of a cedilla (a hook below the A)?  The UTF-8 encoding for lowercase e
with grave is 0xc3 0xa8, which in ISO-8859-1 (LATIN1) or Windows-1252
is uppercase A with tilde followed by a diaeresis (an umlaut on its
own).  Does the data appear correctly if you do either of the following?

SELECT convert(colname, 'utf8', 'latin1') FROM tablename;
  
Yep, I meant the A with the tilde (the word cedilla sounded Spanish... 
lol). Thanks, the "convert" function does the trick for me.


Using win1252 I get a "invalid destination encoding" error, but no 
problem since I know for sure there aren't any euro signs.


Thanks!!

Alex.

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
  choose an index scan if your joining column's datatypes do not
  match


Re: [SQL] Changing character set when the damage is done

2006-12-24 Thread Michael Fuhr
On Sun, Dec 24, 2006 at 06:44:04PM +0100, Alexis Paul Bertolini wrote:
> >SELECT convert(colname, 'utf8', 'latin1') FROM tablename;
>
> Yep, I meant the A with the tilde (the word cedilla sounded Spanish... 
> lol). Thanks, the "convert" function does the trick for me.

If possible consider recreating the database with an encoding other
than SQL_ASCII (e.g., LATIN1 or UTF8).  If you do that, and if you
set client_encoding appropriately when inserting and selecting data,
then the database will convert the data automatically.

> Using win1252 I get a "invalid destination encoding" error, but no 
> problem since I know for sure there aren't any euro signs.

Versions 8.1 and later support win1252 so I'd guess you're using
8.0 or earlier.  Consider upgrading to 8.1 or 8.2 for numerous
improvements over earlier releases.

http://www.postgresql.org/docs/8.1/interactive/release-8-1.html
http://www.postgresql.org/docs/8.2/interactive/release-8-2.html

-- 
Michael Fuhr

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster