I have updated the encode() documentation to not mention "ASCII", and to
be more specific about what 'escape' does.  Backpatched to 8.2.X.

---------------------------------------------------------------------------

Michael Fuhr wrote:
> On Thu, Jan 25, 2007 at 02:28:38PM -0500, Michael Artz wrote:
> > Perhaps my understanding of the 'encode' function is incorrect, but I
> > was under the impression that I could do something like:
> > 
> > SELECT lower(encode(bytes, 'escape')) FROM mytable;
> > 
> > as it sounded like (from the manual) that 'encode' would return valid
> > ASCII, with all the non-ascii bytes hex escaped.
> 
> The documentation for encode() does give that impression: "Encode
> binary string to ASCII-only representation.  Supported types are:
> base64, hex, escape."  However, the source code for esc_encode()
> in src/backend/utils/adt/encode.c says and does otherwise:
> 
>  * Only two characters are escaped:
>  * \0 (null) and \\ (backslash)
> 
> > When I have the byte 0x8a, however, I get the error:
> > 
> > ERROR:  invalid byte sequence for encoding "UTF8": 0x8a
> 
> Since encode() returns text and doesn't escape non-ASCII characters,
> all of the original binary data will be treated as though it's text
> in the database's encoding.  If the data contains byte sequences
> that aren't valid in that encoding then you get the above error.
> 
> > I have the sneaking suspicion that I am missing something, so please
> > correct me if I am wrong.  If I am wrong, is there a better way to
> > lowercase all the ascii characters in a bytea string?
> 
> What are you trying to do?  What is the binary data and why are you
> treating it (or part of it) as though it's text?  Do you want the
> end result to be text with escape sequences or do you want to convert
> it back to bytea?
> 
> Something like this might work:
> 
> SELECT lower(textin(byteaout(bytes))) FROM mytable;
> 
> To turn the result back into bytea:
> 
> SELECT decode(lower(textin(byteaout(bytes))), 'escape') FROM mytable;
> 
> -- 
> Michael Fuhr
> 
> ---------------------------(end of broadcast)---------------------------
> TIP 3: Have you checked our extensive FAQ?
> 
>                http://www.postgresql.org/docs/faq

-- 
  Bruce Momjian  <[EMAIL PROTECTED]>          http://momjian.us
  EnterpriseDB                               http://www.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +
Index: doc/src/sgml/func.sgml
===================================================================
RCS file: /cvsroot/pgsql/doc/src/sgml/func.sgml,v
retrieving revision 1.368
diff -c -c -r1.368 func.sgml
*** doc/src/sgml/func.sgml	20 Feb 2007 18:18:05 -0000	1.368
--- doc/src/sgml/func.sgml	20 Feb 2007 19:48:37 -0000
***************
*** 1356,1363 ****
         </entry>
         <entry><type>text</type></entry>
         <entry>
!         Encode binary data to <acronym>ASCII</acronym>-only representation.  Supported
          types are: <literal>base64</>, <literal>hex</>, <literal>escape</>.
         </entry>
         <entry><literal>encode( E'123\\000\\001', 'base64')</literal></entry>
         <entry><literal>MTIzAAE=</literal></entry>
--- 1356,1365 ----
         </entry>
         <entry><type>text</type></entry>
         <entry>
!         Encode binary data to different representation.  Supported
          types are: <literal>base64</>, <literal>hex</>, <literal>escape</>.
+         <literal>Escape</> merely outputs null bytes as <literal>\000</> and
+         doubles backslashes.
         </entry>
         <entry><literal>encode( E'123\\000\\001', 'base64')</literal></entry>
         <entry><literal>MTIzAAE=</literal></entry>
---------------------------(end of broadcast)---------------------------
TIP 2: Don't 'kill -9' the postmaster

Reply via email to