
Re: [GENERAL] UTF-8 on Postgres wire protocol

2016-12-21 Thread Michael Paquier

On Thu, Dec 22, 2016 at 8:25 AM, Rui Pacheco wrote: > I’m toying around with the wire protocol and came across something I don’t > understand. > > I created a table with two columns, one called “id” and one called “señor”. > When I select from that table I get the list of columns and while its f

[GENERAL] UTF-8 on Postgres wire protocol

2016-12-21 Thread Rui Pacheco

I’m toying around with the wire protocol and came across something I don’t understand. I created a table with two columns, one called “id” and one called “señor”. When I select from that table I get the list of columns and while it’s fairly easy to identify the column with the name “id”, I’m not
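The snippet is cut off before the actual question, so what follows is only a sketch of one relevant detail: on the wire, a column name arrives as a null-terminated byte string, and "señor" occupies six bytes in UTF-8 because "ñ" takes two. The helper name `read_cstring` and the sample buffer are hypothetical, chosen for illustration; a real RowDescription message carries further fixed-width fields after each name, which are omitted here.

```python
def read_cstring(buf: bytes, offset: int, encoding: str = "utf-8"):
    """Read a null-terminated string starting at offset.

    Returns the decoded text and the offset just past the terminator.
    Hypothetical helper for illustration; not taken from any driver.
    """
    end = buf.index(b"\x00", offset)  # position of the terminating zero byte
    return buf[offset:end].decode(encoding), end + 1


# Two column names back to back, each followed by a zero byte.
sample = "id".encode("utf-8") + b"\x00" + "señor".encode("utf-8") + b"\x00"

name1, pos = read_cstring(sample, 0)
name2, pos = read_cstring(sample, pos)

# Character count and byte count differ for the second name.
print(name1, name2, len("señor"), len("señor".encode("utf-8")))
```

Scanning for the terminator on the raw bytes and decoding afterwards avoids any assumption that one byte equals one character.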

Re: [GENERAL] UTF-8 collation on Windows?

2014-02-20 Thread Adrian Klaver

On 02/20/2014 12:27 PM, Dev Kumkar wrote: On Fri, Feb 21, 2014 at 1:26 AM, Adrian Klaver wrote: Well I dug out a Windows machine and tried to get what you wanted, to no avail. As far as I know there is no UTF8 collation, it is an encoding. What you

Re: [GENERAL] UTF-8 collation on Windows?

2014-02-20 Thread Adrian Klaver

On 02/20/2014 11:40 AM, Dev Kumkar wrote: Hmm. Don't want to digress here and lose the topic context. Here would really appreciate if there are any suggestions for UTF-8 collation on Windows? Just had an idea, not sure how feasible it is in your situation though. Run Postgres in a Linux VM on

Re: [GENERAL] UTF-8 collation on Windows?

2014-02-20 Thread Dev Kumkar

On Fri, Feb 21, 2014 at 1:26 AM, Adrian Klaver wrote: > Well I dug out a Windows machine and tried to get what you wanted, to no > avail. As far as I know there is no UTF8 collation, it is an encoding. What > you want if I am following, is the en_US locale (or equivalent for another > language) on

Re: [GENERAL] UTF-8 collation on Windows?

2014-02-20 Thread Adrian Klaver

On 02/20/2014 11:40 AM, Dev Kumkar wrote: Hmm. Don't want to digress here and lose the topic context. Here would really appreciate if there are any suggestions for UTF-8 collation on Windows? Well I dug out a Windows machine and tried to get what you wanted, to no avail. As far as I know t

Re: [GENERAL] UTF-8 collation on Windows?

2014-02-20 Thread Dev Kumkar

On Fri, Feb 21, 2014 at 12:14 AM, Gavin Flower < gavinflo...@archidevsys.co.nz> wrote: > On 21/02/14 02:04, Dev Kumkar wrote: > > On Thu, Feb 20, 2014 at 3:04 AM, Gavin Flower < > gavinflo...@archidevsys.co.nz> wrote: > >> Upgrade servers to Linux? :-P >> > > Actually that's not the solution b

Re: [GENERAL] UTF-8 collation on Windows?

2014-02-20 Thread Gavin Flower

On 21/02/14 02:04, Dev Kumkar wrote: On Thu, Feb 20, 2014 at 3:04 AM, Gavin Flower wrote: Upgrade servers to Linux? :-P Actually that's not the solution but running away from it. There is a heavy footprint of customers and huge market on windows to

Re: [GENERAL] UTF-8 collation on Windows?

2014-02-20 Thread Dev Kumkar

On Thu, Feb 20, 2014 at 3:04 AM, Gavin Flower wrote: > Upgrade servers to Linux? :-P > Actually that's not the solution but running away from it. There is a heavy footprint of customers and huge market on windows too and so not that easy to migrate and convince in market. Regards...

Re: [GENERAL] UTF-8 collation on Windows?

2014-02-20 Thread Dev Kumkar

On Thu, Feb 20, 2014 at 4:34 PM, Daniel Verite wrote: > Despite windows-1252 being a monobyte encoding sharing most > of LATIN1 codes and character set, it does not mean that > English_United States.1252 is limited to this character set. > You may use UTF-8 databases with that locale. > > Consider

Re: [GENERAL] UTF-8 collation on Windows?

2014-02-20 Thread Daniel Verite

Dev Kumkar wrote: > Succeeds but as replied earlier it creates database with LC_COLLATE = > 'English_United States.1252' which corresponds to Latin1. Despite windows-1252 being a monobyte encoding sharing most of LATIN1 codes and character set, it does not mean that English_United States.

Re: [GENERAL] UTF-8 collation on Windows?

2014-02-19 Thread Dev Kumkar

On Thu, Feb 20, 2014 at 3:17 AM, John R Pierce wrote: > On 2/19/2014 1:35 PM, Adrian Klaver wrote: > >> >> Unfortunately this is a Windows install and that does not work either. >> > > windows encodings are a pain. their Unicode is NOT utf8, it's ucs2 aka > utf16. I just checked my default ins

Re: [GENERAL] UTF-8 collation on Windows?

2014-02-19 Thread John R Pierce

On 2/19/2014 1:35 PM, Adrian Klaver wrote: Unfortunately this is a Windows install and that does not work either. windows encodings are a pain. their Unicode is NOT utf8, it's ucs2 aka utf16. I just checked my default install of postgres 9.2, it appears it's using WIN1252 encoding, anothe

Re: [GENERAL] UTF-8 collation on Windows?

2014-02-19 Thread Adrian Klaver

On 02/19/2014 01:30 PM, John R Pierce wrote: On 2/19/2014 1:21 PM, Dev Kumkar wrote: createdb -U postgres -E utf8 -l en-US -T template0 mynewdb Password: *createdb: database creation failed: ERROR: invalid locale name: "en-US"* I believe it's en_US ... _ not - Unfortunately this is a Windows

Re: [GENERAL] UTF-8 collation on Windows?

2014-02-19 Thread Gavin Flower

On 20/02/14 10:28, Adrian Klaver wrote: On 02/19/2014 01:21 PM, Dev Kumkar wrote: On Thu, Feb 20, 2014 at 2:45 AM, Adrian Klaver wrote: Have you tried it? Note that the locale name is different than the one on Linux. On Linux it is en_US. What

Re: [GENERAL] UTF-8 collation on Windows?

2014-02-19 Thread John R Pierce

On 2/19/2014 1:21 PM, Dev Kumkar wrote: createdb -U postgres -E utf8 -l en-US -T template0 mynewdb Password: *createdb: database creation failed: ERROR: invalid locale name: "en-US"* I believe it's en_US ... _ not - -- john r pierce 37N 122W somewhere

Re: [GENERAL] UTF-8 collation on Windows?

2014-02-19 Thread Adrian Klaver

On 02/19/2014 01:21 PM, Dev Kumkar wrote: On Thu, Feb 20, 2014 at 2:45 AM, Adrian Klaver wrote: Have you tried it? Note that the locale name is different than the one on Linux. On Linux it is en_US. What I suggested is en-US. Yes. Here is the

Re: [GENERAL] UTF-8 collation on Windows?

2014-02-19 Thread Dev Kumkar

On Thu, Feb 20, 2014 at 2:45 AM, Adrian Klaver wrote: > > Have you tried it? > > Note that the locale name is different than the one on Linux. > > On Linux it is en_US. > > What I suggested is en-US. > Yes. Here is the output: createdb -U postgres -E utf8 -l en-US -T template0 mynewdb Password: *cre

Re: [GENERAL] UTF-8 collation on Windows?

2014-02-19 Thread Adrian Klaver

On 02/19/2014 01:09 PM, Dev Kumkar wrote: On Thu, Feb 20, 2014 at 2:24 AM, Adrian Klaver wrote: Alright last shot:) Taking hint from here: http://msdn.microsoft.com/en-us/library/x99tb11d.aspx

Re: [GENERAL] UTF-8 collation on Windows?

2014-02-19 Thread Dev Kumkar

On Thu, Feb 20, 2014 at 2:24 AM, Adrian Klaver wrote: > Alright last shot:) > > Taking hint from here: > > http://msdn.microsoft.com/en-us/library/x99tb11d.aspx > > try: > > createdb -U postgres -E utf8 -l en-US > > If that does not work, not sure where to go. This won't work on Windows. Note t

Re: [GENERAL] UTF-8 collation on Windows?

2014-02-19 Thread Adrian Klaver

On 02/19/2014 12:43 PM, Dev Kumkar wrote: On Thu, Feb 20, 2014 at 2:01 AM, Adrian Klaver wrote: Just noticed you are not specifying the template database. Try using template0: createdb -U postgres -E utf8 --lc-ctype=american_usa --lc-collate=am

Re: [GENERAL] UTF-8 collation on Windows?

2014-02-19 Thread Dev Kumkar

On Thu, Feb 20, 2014 at 2:01 AM, Adrian Klaver wrote: > Just noticed you are not specifying the template database. Try using > template0: > > createdb -U postgres -E utf8 --lc-ctype=american_usa > --lc-collate=american_usa -T template0 Same result i.e. LC_COLLATE and LC_CTYPE gets set as 'Engl

Re: [GENERAL] UTF-8 collation on Windows?

2014-02-19 Thread Adrian Klaver

On 02/19/2014 12:16 PM, Dev Kumkar wrote: On Thu, Feb 20, 2014 at 1:41 AM, Adrian Klaver wrote: What does it set LC_CTYPE to? So what happens if you do?: createdb -U postgres -E utf8 -l american_usa.65001 *createdb: database creation failed: ER

Re: [GENERAL] UTF-8 collation on Windows?

2014-02-19 Thread Dev Kumkar

On Thu, Feb 20, 2014 at 1:41 AM, Adrian Klaver wrote: > What does it set LC_CTYPE to? > > So what happens if you do?: > > createdb -U postgres -E utf8 -l american_usa.65001 > *createdb: database creation failed: ERROR: invalid locale name: "american_usa.65001" * > or > > createdb -U postgres

Re: [GENERAL] UTF-8 collation on Windows?

2014-02-19 Thread Adrian Klaver

On 02/19/2014 12:03 PM, Dev Kumkar wrote: On Thu, Feb 20, 2014 at 1:19 AM, Adrian Klaver wrote: So what is the exact command you are using? createdb -U postgres -E utf8 -l american_usa Above command fails to create utf-8 LC_COLLATE. What does it set LC

Re: [GENERAL] UTF-8 collation on Windows?

2014-02-19 Thread Dev Kumkar

On Thu, Feb 20, 2014 at 1:19 AM, Adrian Klaver wrote: > So what is the exact command you are using? createdb -U postgres -E utf8 -l american_usa Above command fails to create utf-8 LC_COLLATE. Regards...

Re: [GENERAL] UTF-8 collation on Windows?

2014-02-19 Thread Adrian Klaver

On 02/19/2014 11:42 AM, Dev Kumkar wrote: On Wed, Feb 19, 2014 at 10:16 PM, Adrian Klaver wrote: I found the below that might help. I do not use Windows much any more so I do not have a machine handy to confirm. http://www.g-loaded.eu/2011/02/27/lo

Re: [GENERAL] UTF-8 collation on Windows?

2014-02-19 Thread Dev Kumkar

On Wed, Feb 19, 2014 at 10:16 PM, Adrian Klaver wrote: > I found the below that might help. I do not use Windows much any more so I > do not have a machine handy to confirm. > > http://www.g-loaded.eu/2011/02/27/locale-windows/ > Thanks for the pointer. "*american_usa*" works however it sets the

Re: [GENERAL] UTF-8 collation on Windows?

2014-02-19 Thread Adrian Klaver

On 02/19/2014 06:41 AM, Dev Kumkar wrote: Am really going nowhere with this after so much searching over the net or am missing some basic things, not sure! What is the equivalent for "en_US.UTF-8" collation in case of windows? In Linux am creating database with following options, as follows: -E ut

[GENERAL] UTF-8 collation on Windows?

2014-02-19 Thread Dev Kumkar

Am really going nowhere with this after so much searching over the net or am missing some basic things, not sure! What is the equivalent for "en_US.UTF-8" collation in case of windows? In Linux am creating database with following options, as follows: -E utf8 -l en_US.UTF-8 -T template0 This creates

Re: [GENERAL] UTF-8 for bytea

2011-11-03 Thread Marko Kreen

On Thu, Nov 3, 2011 at 4:34 AM, Robert James wrote: > When trying to INSERT on Postgres (9.1) to a bytea column, via E'' > escaped strings, I get the strings rejected because they're not UTF8. > I'm confused, since bytea isn't for strings but for binary. What > causes this? How do I fix this? (I

[GENERAL] UTF-8 for bytea

2011-11-02 Thread Robert James

When trying to INSERT on Postgres (9.1) to a bytea column, via E'' escaped strings, I get the strings rejected because they're not UTF8. I'm confused, since bytea isn't for strings but for binary. What causes this? How do I fix this? (I know that escaped strings is not the best way for binary data
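The reply is truncated in this listing, so the fix discussed in the thread is not visible. As a sketch of one way to keep the query text free of invalid byte sequences, assuming the server accepts the hex input format for bytea (available since PostgreSQL 9.0), binary data can be turned into ASCII-only text before it reaches the SQL string. The function name `bytea_hex` is hypothetical.

```python
def bytea_hex(data: bytes) -> str:
    """Return data in bytea hex input form: a backslash, 'x', then hex digits."""
    return "\\x" + data.hex()


raw = bytes([0xC3, 0x28, 0x00, 0xFF])  # not valid UTF-8, and contains a zero byte
text = bytea_hex(raw)

print(text)            # the textual form contains only ASCII characters
print(text.isascii())
```

Where the client library supports bound parameters, passing the bytes as a parameter instead of splicing them into the SQL text is usually the safer route.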

Re: [GENERAL] UTF-8 and Regular expression

2011-05-31 Thread Tom Lane

Håvard Wahl Kongsgård writes: > Hi, in 8.4 how does the regular expression functions in postgresql handle > special UTF-8 characters? Badly :-( > for example: > SELECT name,substring(name from E'\\w+\\s(\\w+)$') from nodes; > fails to select characters like ü ø æ å Should

[GENERAL] UTF-8 and Regular expression

2011-05-31 Thread Håvard Wahl Kongsgård

Hi, in 8.4 how does the regular expression functions in postgresql handle special UTF-8 characters? for example: SELECT name,substring(name from E'\\w+\\s(\\w+)$') from nodes; fails to select characters like ü ø æ å -- Håvard Wahl Kongsgård http://havard.security-review.net/
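To illustrate why a `\w` class can miss letters such as ü ø æ å, here is a small sketch using Python's `re` module as a stand-in. It is not PostgreSQL's regex engine, and the sample name is made up; it only shows the difference between a word class that knows about non-ASCII letters and one that does not.

```python
import re

pattern = r"\w+\s(\w+)$"
name = "Håvard Kongsgård"  # sample value, assumed for illustration

# For str patterns, \w is Unicode-aware by default in Python 3.
unicode_match = re.search(pattern, name)

# With re.ASCII, \w is restricted to [a-zA-Z0-9_], so "å" breaks the word.
ascii_match = re.search(pattern, name, re.ASCII)

print(unicode_match.group(1) if unicode_match else None)
print(ascii_match.group(1) if ascii_match else None)
```

The same pattern either captures the whole last word or fails outright, depending only on how the engine classifies non-ASCII characters.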

Re: [GENERAL] select random order by random

2007-11-01 Thread Chris Browne

[EMAIL PROTECTED] ("piotr_sobolewski") writes: > I was very surprised when I executed such SQL query (under PostgreSQL 8.2): > select random() from generate_series(1, 10) order by random(); > > I thought I would receive ten random numbers in random order. But I received > ten random

Re: [GENERAL] UTF-8 encoding problem

2007-08-16 Thread Peter Eisentraut

On Thursday, 16 August 2007 15:21, bhyuan wrote: > Maybe SQL injection-like security issues will occur, > but I find that different versions of PostgreSQL get different results. That just shows that some versions are more broken than others. But there was a lot of thought put into the current

[GENERAL] UTF-8 encoding

2007-08-16 Thread James B. Byrne

On Wed, August 15, 2007 21:15, Phoenix Kiula wrote: > > Thanks. Here's my locale information: > >> locale > LANG=en_US.UTF-8 > LC_CTYPE="en_US.UTF-8" > LC_NUMERIC="en_US.UTF-8" > LC_TIME="en_US.UTF-8" > LC_COLLATE="en_US.UTF-8" > LC_MONETARY="en_US.UTF-8" > LC_MESSAGES="en_US.UTF-8" > LC_PAPER="en

Re: [GENERAL] UTF-8 encoding problem

2007-08-16 Thread bhyuan

Thanks for your reply. Maybe SQL injection-like security issues will occur, but I find that different versions of PostgreSQL get different results. Such as the sql set client_encoding='SJIS'; select '\xc3\xaa',* from xxx; on V7.4 @RH3 got \xc3\xaa on [EMAIL PROTECTED] got (blank) on [EMA

Re: [GENERAL] UTF-8 encoding problem

2007-08-16 Thread Peter Eisentraut

On Thursday, 16 August 2007 08:40, bhyuan wrote: > Can I ignore the error message by configuring the config file? No, there are no provisions for that. Some errors of this type used to be ignored, but that led to SQL injection-like security issues, so you don't want that. -- Peter Eisentrau

[GENERAL] UTF-8 encoding problem

2007-08-15 Thread bhyuan

Hi, I use UTF-8 as the server character encoding and SJIS as the client character encoding. For some reason, some non-SJIS characters were inserted into the database. When I use set client_encoding='SJIS'; select * from xxx I get this error message: Native Error: ERROR: character 0xc2a0 of encoding
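When an error reports a character only by its bytes, decoding them shows which character is involved. A minimal sketch, with `describe_utf8` as a hypothetical helper name:

```python
import unicodedata


def describe_utf8(hex_bytes: str) -> str:
    """Decode a hex string of UTF-8 bytes and name each resulting character."""
    text = bytes.fromhex(hex_bytes).decode("utf-8")
    return ", ".join(
        f"U+{ord(ch):04X} {unicodedata.name(ch, '<unnamed>')}" for ch in text
    )


# The byte pair quoted in the error message above.
print(describe_utf8("c2a0"))
```

Knowing the character makes it easier to decide whether to replace it in the stored data or to pick a client encoding that can represent it.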

Re: [GENERAL] UTF-8 to ASCII

2007-05-11 Thread Alvaro Herrera

Tom Lane wrote: > Alvaro Herrera <[EMAIL PROTECTED]> writes: > > Why on earth is it talking about MULE_INTERNAL? > > IIRC, a lot of the conversions translate through some common > intermediate charset to save on code/table space. In such cases > the problem will usually be detected on the back

Re: [GENERAL] UTF-8 to ASCII

2007-05-11 Thread Tom Lane

Alvaro Herrera <[EMAIL PROTECTED]> writes: > Why on earth is it talking about MULE_INTERNAL? IIRC, a lot of the conversions translate through some common intermediate charset to save on code/table space. In such cases the problem will usually be detected on the backend conversion...

Re: [GENERAL] UTF-8 to ASCII

2007-05-11 Thread Martin Gainty

<[EMAIL PROTECTED]> Cc: Sent: Friday, May 11, 2007 9:33 AM Subject: Re: [GENERAL] UTF-8 to ASCII Martin Marques wrote: I have a doubt about the function to_ascii() and what the documentation says. Basically, I passed my DB from latin1 to UTF-8, and I started getting an error when us

Re: [GENERAL] UTF-8 to ASCII

2007-05-11 Thread Alvaro Herrera

Martin Marques wrote: > I have a doubt about the function to_ascii() and what the documentation > says. > > Basically, I passed my DB from latin1 to UTF-8, and I started getting an > error when using the to_ascii() function on a field of one of my DB [1]: > > ERROR: la conversión de codific

Re: [GENERAL] UTF-8 to ASCII

2007-05-11 Thread Martin Marques

Albe Laurenz wrote: [2]: http://www.postgresql.org/docs/8.1/interactive/functions-string.html#FTN.AEN7625 Well, the documentation for to_ascii states clearly: "The to_ascii function supports conversion from LATIN1, LATIN2, LATIN9, and WIN1250 encodings only." Sorry, didn't see the footn

Re: [GENERAL] UTF-8 to ASCII

2007-05-11 Thread Albe Laurenz

> I have a doubt about the function to_ascii() and what the > documentation says. > > Basically, I passed my DB from latin1 to UTF-8, and I started What do you mean by 'passed the DB from Latin1 to UTF8'? > getting an error when using the to_ascii() function on a field > of one of my DB [1]: >

Re: [GENERAL] UTF-8 to ASCII

2007-05-11 Thread Arnaud Lesauvage

Martin Marques wrote: I have a doubt about the function to_ascii() and what the documentation says. Basically, I passed my DB from latin1 to UTF-8, and I started getting an error when using the to_ascii() function on a field of one of my DB [1]: ERROR: la conversión de codificación de UT

Re: [GENERAL] UTF-8 to ASCII

2007-05-11 Thread Martin Marques

LEGEAY Jérôme wrote: To convert my DB, I use this process: createdb -T "old_DB" "copy_old_DB" dropdb "old_DB" createdb -E LATIN1 -T "copy_old_DB" "new_DB_name" maybe this process will help you. As I said in my original mail, the DB conversion went OK, but I see some discrepancies in the do

Re: [GENERAL] UTF-8 to ASCII

2007-05-11 Thread LEGEAY Jérôme

To convert my DB, I use this process: createdb -T "old_DB" "copy_old_DB" dropdb "old_DB" createdb -E LATIN1 -T "copy_old_DB" "new_DB_name" maybe this process will help you. regards Jérôme LEGEAY At 14:13 11/05/2007, you wrote: I have a doubt about the function to_ascii() and what the

[GENERAL] UTF-8 to ASCII

2007-05-11 Thread Martin Marques

I have a doubt about the function to_ascii() and what the documentation says. Basically, I passed my DB from latin1 to UTF-8, and I started getting an error when using the to_ascii() function on a field of one of my DB [1]: ERROR: la conversión de codificación de UTF8 a ASCII no está soporta

Re: [GENERAL] UTF-8

2006-10-13 Thread Martins Mihailovs

Martijn van Oosterhout wrote: On Thu, Oct 12, 2006 at 11:09:53PM +0200, Tomi NA wrote: 2006/10/12, Martijn van Oosterhout : On Tue, Oct 10, 2006 at 11:49:06AM +0300, Martins Mihailovs wrote: There are some misunderstood. Im using Linux 2.6.16.4, postgresql 8.1.4, (there are one of locale: lv

Re: [GENERAL] UTF-8

2006-10-13 Thread Tom Lane

Martijn van Oosterhout writes: > Characters haven't fitted in an unsigned char in a very long time. It's > obviously bogus for any multibyte encoding (the code even says so). For > such encodings you could use the system's towupper() (ANSI C/Unix98) > which will work on any unicode char. http://de
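The point about characters not fitting in a single byte can be shown with a short sketch. Python's `bytes.lower()` is used here only as an analogy for byte-at-a-time case conversion; it is not the PostgreSQL code under discussion, and the sample letter is an assumption chosen from the Latvian alphabet mentioned in the thread.

```python
capital = "\u0100"                     # LATIN CAPITAL LETTER A WITH MACRON
encoded = capital.encode("utf-8")      # two bytes: C4 80

# Byte-at-a-time lowercasing only touches ASCII letters, so nothing changes.
bytewise = encoded.lower()

# Character-aware lowercasing maps the letter to U+0101 before encoding.
charwise = capital.lower().encode("utf-8")

print(encoded.hex(), bytewise.hex(), charwise.hex())
```

Case conversion for a multibyte encoding has to decode to whole characters first, which is what wide-character functions such as `towupper()` operate on.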

Re: [GENERAL] UTF-8

2006-10-13 Thread Martijn van Oosterhout

On Fri, Oct 13, 2006 at 12:04:02PM -0400, Tom Lane wrote: > "Tomi NA" <[EMAIL PROTECTED]> writes: > > 2006/10/13, Martijn van Oosterhout : > >> Similarly, upper/lower are also supported, although postgresql doesn't > >> take advantage of the system support in that case. > > > I think this is the c

Re: [GENERAL] UTF-8

2006-10-13 Thread Tom Lane

"Tomi NA" <[EMAIL PROTECTED]> writes: > 2006/10/13, Martijn van Oosterhout : >> Similarly, upper/lower are also supported, although postgresql doesn't >> take advantage of the system support in that case. > I think this is the crux of the problem. If it were true, then it might be ...

Re: [GENERAL] UTF-8

2006-10-13 Thread Tomi NA

2006/10/13, Martijn van Oosterhout : While sorting for multiple languages simultaneously is an issue, that's not the problem here. Linux/GLibc *does* support correct sorting for all language/charset combinations, and that's what he's using. Just for the hell of it I setup lv_LV.utf8 on my laptop

Re: [GENERAL] UTF-8

2006-10-13 Thread Martijn van Oosterhout

On Fri, Oct 13, 2006 at 03:40:17PM +0200, Tomi NA wrote: > This is a reoccurring topic on the list: sure, it's possible to > misconfigure pg so that uppercase/lowercase/ilike/tsearch2/order don't > work with a single letter outside of the English alphabet, but the > problem Martins seems to be faci

Re: [GENERAL] UTF-8

2006-10-13 Thread Tomi NA

2006/10/13, Martijn van Oosterhout : On Thu, Oct 12, 2006 at 11:09:53PM +0200, Tomi NA wrote: > 2006/10/12, Martijn van Oosterhout : > >On Tue, Oct 10, 2006 at 11:49:06AM +0300, Martins Mihailovs wrote: > >> There are some misunderstood. Im using Linux 2.6.16.4, postgresql 8.1.4, > >> (there are

Re: [GENERAL] UTF-8

2006-10-13 Thread Martijn van Oosterhout

On Thu, Oct 12, 2006 at 11:09:53PM +0200, Tomi NA wrote: > 2006/10/12, Martijn van Oosterhout : > >On Tue, Oct 10, 2006 at 11:49:06AM +0300, Martins Mihailovs wrote: > >> There are some misunderstood. Im using Linux 2.6.16.4, postgresql 8.1.4, > >> (there are one of locale: lv_LV.utf8, for Latvia

Re: [GENERAL] UTF-8

2006-10-12 Thread Tomi NA

2006/10/12, Martijn van Oosterhout : On Tue, Oct 10, 2006 at 11:49:06AM +0300, Martins Mihailovs wrote: > There are some misunderstood. Im using Linux 2.6.16.4, postgresql 8.1.4, > (there are one of locale: lv_LV.utf8, for Latvian language). But if I > want do "lower", then with standard latin

Re: [GENERAL] UTF-8

2006-10-12 Thread Martijn van Oosterhout

On Tue, Oct 10, 2006 at 11:49:06AM +0300, Martins Mihailovs wrote: > There are some misunderstood. Im using Linux 2.6.16.4, postgresql 8.1.4, > (there are one of locale: lv_LV.utf8, for Latvian language). But if I > want do "lower", then with standard latin symbols all is ok, but with > others

Re: [GENERAL] UTF-8

2006-10-11 Thread Martins Mihailovs

Martijn van Oosterhout wrote: On Fri, Oct 06, 2006 at 12:44:43PM +0300, Martins Mihailovs wrote: I would be a glad to hear your solutions, experience in web application with multi languages (searching with indexing, sorting and others problems with multi byte encoding). For developers: what a

Re: [GENERAL] UTF-8

2006-10-09 Thread Martijn van Oosterhout

On Fri, Oct 06, 2006 at 12:44:43PM +0300, Martins Mihailovs wrote: > I would be a glad to hear your solutions, experience in web application > with multi languages (searching with indexing, sorting and others > problems with multi byte encoding). > > For developers: what are your future plans ab

[GENERAL] UTF-8

2006-10-06 Thread Martins Mihailovs

Hello! I'm using PgSQL for a 3 years for web applications, but not only. But the main problem is in encoding. My web applications are used by international (mostly 3 languages: latvian (LATIN7), english and russian). The best (mostly) solution is to use UTF-8, but there are a lot of problems.

Re: [GENERAL] UTF-8, upper() and Chinese characters yielding blank result

2006-07-27 Thread Martijn van Oosterhout

On Thu, Jul 27, 2006 at 07:22:17PM +0200, Peter Eisentraut wrote: > Scott Eade wrote: > > The problem appears on PostgreSQL 8.0.7 (on WinXP) > > PostgreSQL 8.0 on Windows does not support UTF-8. In addition, PostgreSQL is totally reliant on the OS for upper/lower/collation support, so there is no

Re: [GENERAL] UTF-8, upper() and Chinese characters yielding blank result

2006-07-27 Thread Peter Eisentraut

Scott Eade wrote: > The problem appears on PostgreSQL 8.0.7 (on WinXP) PostgreSQL 8.0 on Windows does not support UTF-8. -- Peter Eisentraut http://developer.postgresql.org/~petere/

[GENERAL] UTF-8, upper() and Chinese characters yielding blank result

2006-07-27 Thread Scott Eade

While I could see various multibyte issues in the archives and in the TODO list, I couldn't spot this exact issue: I am working with a database that uses UNICODE encoding. I have a varchar column (col_x) that includes a mix of Chinese and regular ASCII characters. On PostgreSQL 7.4.13 (on RH

[GENERAL] UTF-8 and stripping accents

2006-06-15 Thread Christopher Murtagh

Greetings folks, I'm trying to write a stored procedure that strips accents from UTF-8 encoded text. I saw a thread on this list discussing something very similar to this on April 8th, and used it to start. However, I'm getting odd behaviour. My stored procedure: CREATE OR REPLACE FUNCTION stri
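The stored procedure itself is cut off in this listing. As a sketch of the general technique, and not the author's function, accents can be stripped by decomposing the text (Unicode NFD) and dropping the combining marks. The name `strip_accents` is hypothetical.

```python
import unicodedata


def strip_accents(text: str) -> str:
    """Remove combining marks after canonical decomposition (NFD)."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


print(strip_accents("señor"))
print(strip_accents("Håvard"))
print(strip_accents("ø"))  # has no decomposition, so it is left unchanged
```

Letters such as "ø" or "æ" are distinct characters and not base letters plus marks, so they need an explicit mapping if they should also be replaced.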

Re: [GENERAL] UTF-8 context of BYTEA datatype??

2006-06-01 Thread Rafal Pietrak

On Thu, 2006-06-01 at 02:00 +, Greg Sabino Mullane wrote: > #!perl > > package testone; > use DBI; > > printf "SQL_INTEGER is %d\n", SQL_INTEGER; > > package testtwo; > use DBI qw(:sql_types); > > printf "SQL_INTEGER is %d\n", SQL_INTEGER; But this is not as bad as having to "use DBD::Pg" (

Re: [GENERAL] UTF-8 context of BYTEA datatype??

2006-05-31 Thread Greg Sabino Mullane

Rafal Pietrak asked: > 2. I admit that I should have spotted myself that the > DBD::Pg::PG_BYTEA might not have been recognized without the use > clause, but the driver itself understands pretty much of the > underlying datatypes - I don't need

Re: [GENERAL] UTF-8 context of BYTEA datatype??

2006-05-31 Thread Martijn van Oosterhout

On Wed, May 31, 2006 at 11:31:28AM +0200, Daniel Verite wrote: > Martijn van Oosterhout wrote: > > > However, there is a solution: send the parameters separate from the > > query. In fact, postgres has been able to do that for a while now but > > not all interfaces have been made to use it. M

Re: [GENERAL] UTF-8 context of BYTEA datatype??

2006-05-31 Thread Daniel Verite

Martijn van Oosterhout wrote: > However, there is a solution: send the parameters separate from the > query. In fact, postgres has been able to do that for a while now but > not all interfaces have been made to use it. My guess is that those > other databases you've used were already doing

Re: [GENERAL] UTF-8 context of BYTEA datatype??

2006-05-31 Thread Rafal Pietrak

On Tue, 2006-05-30 at 22:47 +0200, Martijn van Oosterhout wrote: > That's why bytea need special encoding to get around this check. But may be you would know, why I should write: { pg_type => DBD::Pg::PG_BYTEA } instead of possibly more generic: { TYPE => SQL_BINARY } The later

Re: [GENERAL] UTF-8 context of BYTEA datatype??

2006-05-30 Thread Martijn van Oosterhout

On Tue, May 30, 2006 at 10:26:31PM +0200, Rafal Pietrak wrote: > Now, this is probably not exactly the forum to discuss that, but: > 1. I did quite a few scripts with DBI, not only for PostgreSQL in fact - > scripts worked flawlessly between Oracle/Sybase and the old DBASE files, > too. And I have

Re: [GENERAL] UTF-8 context of BYTEA datatype??

2006-05-30 Thread Rafal Pietrak

On Tue, 2006-05-30 at 20:12 +0200, Daniel Verite wrote: > Rafal Pietrak wrote: > > Hmmm, despite initial euphoria, this doesn't actually work. > > Just an idea: make sure DBD::Pg::PG_BYTEA is defined. > If not, you're just lacking a "use DBD::Pg;" and the result :) This time it's a hit. The

Re: [GENERAL] UTF-8 context of BYTEA datatype??

2006-05-30 Thread Daniel Verite

Rafal Pietrak wrote: > On Mon, 2006-05-29 at 14:01 +0200, Martijn van Oosterhout wrote: > > > > > > How come the bytearea is *interpreted* as having encoding? > > > > Actually, it's not the bytea type that is being interpreted, it's the > > string you're sending to the server that is. Be

Re: [GENERAL] UTF-8 context of BYTEA datatype??

2006-05-30 Thread SCassidy

y: Subject: Re: [GENERAL]

Re: [GENERAL] UTF-8 context of BYTEA datatype??

2006-05-30 Thread Rafal Pietrak

On Tue, 2006-05-30 at 09:05 -0700, [EMAIL PROTECTED] wrote: > Did you try escaping the data: > my $rc=$sth->bind_param(1, escape_bytea($imgdata), { pg_type => > DBD::Pg::PG_BYTEA }); No. But: $ ./test Undefined subroutine &main::escape_bytea called at ./test line 34. Where can I find on

Re: [GENERAL] UTF-8 context of BYTEA datatype??

2006-05-30 Thread SCassidy

cc: Sent by: Subject: Re: [GENERAL] UTF-8 context of BY

Re: [GENERAL] UTF-8 context of BYTEA datatype??

2006-05-29 Thread Rafal Pietrak

On Mon, 2006-05-29 at 14:01 +0200, Martijn van Oosterhout wrote: > > > > How come the bytearea is *interpreted* as having encoding? > > Actually, it's not the bytea type that is being interpreted, it's the > string you're sending to the server that is. Before you send bytea data > in a query stri

Re: [GENERAL] UTF-8 context of BYTEA datatype??

2006-05-29 Thread Martijn van Oosterhout

On Mon, May 29, 2006 at 01:35:58PM +0200, Rafal Pietrak wrote: > The table is originally initialized with a set of IDs. Then I'm using > perl-script to insert appropriate images by means of UPDATEing rows: > --within my script called 'job'--- > my $db = DBI->connect

Re: [GENERAL] UTF-8 context of BYTEA datatype??

2006-05-29 Thread Peter Eisentraut

On Monday, 29 May 2006 13:35, Rafal Pietrak wrote: > How come the bytearea is *interpreted* as having encoding? If you pass data in text mode, all data is subject to encoding handling. If you don't want that, you need to use the binary mode. > Or to put it the other way around: What column da

[GENERAL] UTF-8 context of BYTEA datatype??

2006-05-29 Thread Rafal Pietrak

Hi! Within a UTF-8 encoded database, I have a table: CREATE TABLE pics (id serial not null unique, img bytea); The table is originally initialized with a set of IDs. Then I'm using perl-script to insert appropriate images by means of UPDATEing rows: --within my script called 'j

Re: [GENERAL] utf-8 and cultural sensitive sorting

2005-07-12 Thread Tatsuo Ishii

> It depends what language you want to sort. Lots of languages do not > have a sort alphabet. For example, Japanese. It can be quite > difficult to sort unusual languages like this. I am not aware of any > standard technique for sorting Japanese text other than keeping an > arbitrarily sort

Re: [GENERAL] utf-8 and cultural sensitive sorting

2005-07-12 Thread Alex Stapleton

It depends what language you want to sort. Lots of languages do not have a sort alphabet. For example, Japanese. It can be quite difficult to sort unusual languages like this. I am not aware of any standard technique for sorting Japanese text other than keeping an arbitrarily sorted diction

Re: [GENERAL] utf-8 and cultural sensitive sorting

2005-07-12 Thread Richard Huxton

[EMAIL PROTECTED] wrote: Our product will be storing its character data in utf-8 format (unicode encoding). What is the best way to achieve culturally sensitive sorting using the utf-8 data? See below. Is it possible to have the locale apply to a connection? A locale applies to a whole databas

[GENERAL] utf-8 and cultural sensitive sorting

2005-07-12 Thread sknipe

Our product will be storing its character data in utf-8 format (unicode encoding). What is the best way to achieve culturally sensitive sorting using the utf-8 data? Is it possible to have the locale apply to a connection? If so, is the cultural sorting support mature in PostgreSQL? What type o

[GENERAL] UTF-8 and LC_CTYPE locale

2005-05-16 Thread Stefan Hans

Hi *, we are using PostgreSQL for data in different languages like English, German and French. The encoding and locale parameters on our OS (UTF-8 and en_US.UTF-8) had problems e.g. with german umlaut. After some tries we found encoding and locale parameters (LATIN1 and de_DE.iso88591)

Re: [GENERAL] UTF-8 and =, LIKE problems

2004-11-03 Thread Michael Glaesemann

On Nov 4, 2004, at 1:24 PM, Edmund Lian wrote: I am running a web-based accounting package (SQL-Ledger) that supports multiple languages on PostgreSQL. When a database encoding is set to Unicode, multilingual operation is possible. Semantically, one might expect U+FF17 U+FF19 to be identical to
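The fullwidth digits U+FF17 U+FF19 and the ASCII string "79" look alike but are different code points, so a plain equality test treats them as different. A short sketch of the comparison, and of folding the fullwidth forms with Unicode compatibility normalization (NFKC); whether the thread settled on normalization is not visible in the truncated snippet, so this is only one possible approach.

```python
import unicodedata

fullwidth = "\uff17\uff19"   # FULLWIDTH DIGIT SEVEN, FULLWIDTH DIGIT NINE
ascii_digits = "79"

# Different code points, and different byte lengths in UTF-8.
print(fullwidth == ascii_digits)
print(len(fullwidth.encode("utf-8")), len(ascii_digits.encode("utf-8")))

# NFKC maps the fullwidth forms to their ASCII counterparts.
folded = unicodedata.normalize("NFKC", fullwidth)
print(folded == ascii_digits)
```

Normalizing on input, before the value is stored, keeps later `=` and `LIKE` comparisons simple.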

[GENERAL] UTF-8 and =, LIKE problems

2004-11-03 Thread Edmund Lian

I am running a web-based accounting package (SQL-Ledger) that supports multiple languages on PostgreSQL. When a database encoding is set to Unicode, multilingual operation is possible. However, when a user's input language is set to say English, and the user enters data such as "79", the data t

Re: [GENERAL] UTF-8 -> ISO8859-1 conversion problem

2004-10-30 Thread Cott Lang

Thanks for the detailed reply, you've confirmed what I suspected. :) I guess I have some work to do! On Fri, 2004-10-29 at 10:19, J. Michael Crawford wrote: >In my experience, there are just some characters that don't want to be > converted, even if they appear to be part of the normal 8-bi

Re: [GENERAL] UTF-8 -> ISO8859-1 conversion problem

2004-10-29 Thread Ian Pilcher

Cott Lang wrote: ERROR: could not convert UTF-8 character 0x00ef to ISO8859-1 Running 7.4.5, I frequently get this error, and ONLY on this particular character despite seeing quite a bit of 8 bit. I don't really follow why it can't be converted, it's the same character (239) in both character sets.

Re: [GENERAL] UTF-8 -> ISO8859-1 conversion problem

2004-10-29 Thread J. Michael Crawford

Correction: Four things that need to be done, THREE if you're not serving up html. Sorry for the editing error. - Mike At 01:19 PM 10/29/2004, J. Michael Crawford wrote: In my experience, there are just some characters that don't want to be converted, even if they appear to be part

Re: [GENERAL] UTF-8 -> ISO8859-1 conversion problem

2004-10-29 Thread J. Michael Crawford

In my experience, there are just some characters that don't want to be converted, even if they appear to be part of the normal 8-bit character system. We went to Unicode databases to hold our Latin1 characters because of this. There was even a case where the client was cutting and pasting a

[GENERAL] UTF-8 -> ISO8859-1 conversion problem

2004-10-29 Thread Cott Lang

ERROR: could not convert UTF-8 character 0x00ef to ISO8859-1 Running 7.4.5, I frequently get this error, and ONLY on this particular character despite seeing quite a bit of 8 bit. I don't really follow why it can't be converted, it's the same character (239) in both character sets. Databases are i
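Character 239 is the same code point in Unicode and ISO 8859-1, but its byte representation differs between UTF-8 and Latin-1. The snippet is truncated before any diagnosis, so the following is only a sketch of that difference; `is_valid_utf8` is a hypothetical helper.

```python
ch = "\u00ef"                      # code point 239, LATIN SMALL LETTER I WITH DIAERESIS
latin1 = ch.encode("latin-1")      # one byte: EF
utf8 = ch.encode("utf-8")          # two bytes: C3 AF


def is_valid_utf8(data: bytes) -> bool:
    """Return True if data decodes cleanly as UTF-8."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


print(latin1.hex(), utf8.hex())
print(is_valid_utf8(latin1), is_valid_utf8(utf8))
```

A lone EF byte is the start of a three-byte UTF-8 sequence, so Latin-1 data stored under a UTF-8 label is not valid UTF-8 even though the code point exists in both character sets.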

Re: [GENERAL] UTF-8 question.

2004-09-16 Thread Pierre-Frédéric Caillaud

=> show client_encoding ; client_encoding - UNICODE (1 ligne) => select char_length('a'), bit_length('a'); char_length | bit_length -+ 1 | 8 (1 ligne) # that's an accented "e" => select char_length('é'), bit_length('é'); ; char_length

Re: [GENERAL] UTF-8 question.

2004-09-16 Thread Tom Lane

"Richard Connamacher" <[EMAIL PROTECTED]> writes: > 7.1 may be prehistoric, but it's running on an off-site server that I'm > renting, and this version came pre-installed. Since it's already there > and working, I'd like to get familiar with it before I try to reinstall > a newer version. I doubt I

Re: [GENERAL] UTF-8 question.

2004-09-16 Thread Richard Connamacher

Thanks to both Dan Sugalski and Michael Glaesemann for answering my question. I probably should have realized that, while Latin letters are one byte, the fact that others are encoded into up to 5-byte groups qualifies it as a multi-byte encoding. I don't anticipate having very many non-latin letter

Re: [GENERAL] UTF-8 question.

2004-09-16 Thread Michael Glaesemann

On Sep 17, 2004, at 9:39 AM, Richard Connamacher wrote: UTF-8 is the 8-bit version of Unicode. The multibyte version of Unicode is UTF-16. UTF-8 encodes characters with varying numbers of bytes, not just 1 byte per character. IIRC, it's anywhere from 1 to 5 bytes, actually. PostgreSQL uses UTF-8.
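The "1 to 5 bytes" figure above is quoted from memory ("IIRC"). Under the current definition of UTF-8 (RFC 3629), a code point takes between 1 and 4 bytes. A quick check, with sample characters chosen for illustration:

```python
def utf8_len(ch: str) -> int:
    """Number of bytes one character occupies in UTF-8."""
    return len(ch.encode("utf-8"))


samples = ["a", "é", "€", "\U0001F600"]   # U+0061, U+00E9, U+20AC, U+1F600

for ch in samples:
    print(f"U+{ord(ch):04X} -> {utf8_len(ch)} byte(s)")
```

Character length and byte length therefore differ for any text outside ASCII, which matters when sizing columns or buffers.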

Re: [GENERAL] UTF-8 question.

2004-09-16 Thread Dan Sugalski

At 8:39 PM -0400 9/16/04, Richard Connamacher wrote: I'm new to PostgreSQL, and from the looks of it, it's a great database, and I'll be using more of it in the future. I had a quick question if anyone could clear this up. The documentation for PostgreSQL (version 7.1, the version this server is us

[GENERAL] UTF-8 question.

2004-09-16 Thread Richard Connamacher

I'm new to PostgreSQL, and from the looks of it, it's a great database, and I'll be using more of it in the future. I had a quick question if anyone could clear this up. The documentation for PostgreSQL (version 7.1, the version this server is using) says that it supports multibyte character encod
