Re: [GENERAL] text and bytea

2008-03-03 Thread Bruce Momjian
Tom Lane wrote: hernan gonzalez [EMAIL PROTECTED] writes: test=# create view vchartest as select encode(convert_to(c,'LATIN9'),'escape') as c1 from chartest; Hmm. This isn't a very sensible combination that you've written here, but I see the point: encode(..., 'escape') is broken in

Re: [GENERAL] text and bytea

2008-02-25 Thread hernan gonzalez
Umm, I think all you showed was that the to_ascii() function was broken. Postgres knows exactly what encoding the string is in, the backend encoding: in your case UTF-8. That would be fine, if it were true; then, one could assume that every postgresql function that returns a text gets

Re: [GENERAL] text and bytea

2008-02-25 Thread Gregory Stark
hernan gonzalez [EMAIL PROTECTED] writes: IMHO, the semantics of encode() and decode() are correct (the bridge between bytea and text ... in the backend encoding; they should be the only bridge), convert() is also ok (deals with bytes), but convert_to() and convert_from() are dubious if not

Re: [GENERAL] text and bytea

2008-02-25 Thread hernan gonzalez
IMHO, the semantics of encode() and decode() are correct (the bridge between bytea and text ... in the backend encoding; they should be the only bridge), convert() is also ok (deals with bytes), but convert_to() and convert_from() are dubious if not broken: they imply texts in arbitrary

Re: [GENERAL] text and bytea

2008-02-25 Thread hernan gonzalez
Another example (PostgreSQL 8.3.0, UTF-8 server/client encoding): test=# create table chartest ( c text); test=# insert into chartest (c) values ('¡Hasta mañana!'); test=# create view vchartest as select encode(convert_to(c,'LATIN9'),'escape') as c1 from chartest; test=# select
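
The byte-level effect behind this example can be sketched outside the database. The following is a minimal Python analogy, not PostgreSQL code: it assumes that Python's `iso8859_15` codec corresponds to the LATIN9 encoding named in the view, and the helper `is_valid_utf8` is a hypothetical name introduced only for illustration. It shows that the LATIN9 bytes of the sample string are not a valid UTF-8 sequence, which is why treating them as text in a UTF-8 backend is a problem.

```python
# Python analogy only: str plays the role of text, bytes the role of bytea.
text = "¡Hasta mañana!"

# Roughly what convert_to(c, 'LATIN9') yields: the string as LATIN9 bytes.
latin9_bytes = text.encode("iso8859_15")

# The same string in UTF-8, the backend encoding used in the example.
utf8_bytes = text.encode("utf-8")


def is_valid_utf8(data: bytes) -> bool:
    """Return True if data decodes cleanly as UTF-8 (hypothetical helper)."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


print(len(latin9_bytes), len(utf8_bytes))
print(is_valid_utf8(utf8_bytes), is_valid_utf8(latin9_bytes))
```

The two accented characters take one byte each in LATIN9 and two bytes each in UTF-8, so the byte strings differ in length, and only the UTF-8 one passes the validity check.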

Re: [GENERAL] text and bytea

2008-02-25 Thread Tom Lane
hernan gonzalez [EMAIL PROTECTED] writes: The objectionable ones IMHO are decode()/encode(), which can consume/produce a non-utf8 string (I mean, not the backend encoding) Huh? Those deal with bytea too --- in fact, they've got nothing at all to do with multibyte character representations.

Re: [GENERAL] text and bytea

2008-02-25 Thread Tom Lane
hernan gonzalez [EMAIL PROTECTED] writes: test=# create view vchartest as select encode(convert_to(c,'LATIN9'),'escape') as c1 from chartest; Hmm. This isn't a very sensible combination that you've written here, but I see the point: encode(..., 'escape') is broken in that it fails to convert

Re: [GENERAL] text and bytea

2008-02-24 Thread hernan gonzalez
It seems to me that postgres is trying to do as you suggest: text is characters and bytea is bytes, like in Java. But the big difference is that, for the text type, PostgreSQL knows this is text but doesn't know the encoding, as my example showed. This goes against the concept of text vs bytes

Re: [GENERAL] text and bytea

2008-02-24 Thread Martijn van Oosterhout
On Fri, Feb 22, 2008 at 01:54:46PM -0200, hernan gonzalez wrote: It seems to me that postgres is trying to do as you suggest: text is characters and bytea is bytes, like in Java. But the big difference is that, for the text type, PostgreSQL knows this is text but doesn't know the encoding,

Re: [GENERAL] text and bytea

2008-02-22 Thread Martijn van Oosterhout
On Thu, Feb 21, 2008 at 02:34:15PM -0200, hernan gonzalez wrote: (After dealing with this for a while, and learning a little, I thought of posting this as a comment in the docs, but perhaps someone who knows better can correct or clarify) It seems to me that postgres is trying to do as you suggest: text

Re: [GENERAL] text and bytea

2008-02-22 Thread Alvaro Herrera
Martijn van Oosterhout wrote: The most surprising thing is that to_ascii won't accept a bytea. TBH the whole to_ascii function seems somewhat half-baked. If what you're trying to do is remove accents, there are Perl functions around that do that. Basically, the switch to a different normal

[GENERAL] text and bytea

2008-02-21 Thread hernan gonzalez
(After dealing with this for a while, and learning a little, I thought of posting this as a comment in the docs, but perhaps someone who knows better can correct or clarify) = The issues of charset